Code aware search
Save time combing through usage results with a semantic search that ranks definitions ahead of usages and variable names. Sign up for Bitbucket Cloud to take it for a spin.
The search for code search is finally over: Bitbucket Cloud is launching code aware search, specifically built for teams who have many repos or large code bases.
What makes Bitbucket Cloud’s search “code aware”? Rather than simply indexing your code as text, we built a semantic search that has our systems do the grunt work for you. Bitbucket Cloud analyzes your code syntax, ensuring definitions matching your search term are prioritized over usages and variable names. Assuming your team is re-using code effectively, the ratio of usages to definitions will increase as your codebase grows, making this a big time saver on larger projects.
For example, if you search for “FastHashMap”, which document would you want to appear first?
public class FastHashMap {
    /* ... */
}
or
import foo.bar.FastHashMap;
public class SomeOtherClass {
    public void doSomething() {
        FastHashMap fastHashMap = new FastHashMap();
    }
}
You’d prefer the class definition, right? Let’s take a deeper look at how we built code aware search to return the most relevant results quickly.
How code search works in Bitbucket Cloud
Search indices built using traditional text indexers will usually return the usage result first because it contains a higher number of exact matches for your search term. In code bases where the same class or function is used many times, developers are often left trawling through page after page of usage results trying to hunt down the definition.
We took a different approach: by boosting the definitions matching your search term, the result you want is likely to rank much higher (usually #1) in the search results. Our algorithm boosts definitions for a wide range of categories including classes, functions, enums, structs, and interfaces. We prioritized building a code aware search scoped to team and user accounts over a global search. This way, we hope to give our users the relevant results they want quickly, without the hassle of checking out a repo locally and searching in an IDE.
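To make the idea of boosting concrete, here is a minimal sketch in Java. It is not Bitbucket Cloud's implementation: the class name, the regex used in place of real syntax analysis, and the boost weight are all assumptions made for illustration. It scores a document by how often the term appears, then adds a fixed bonus when the document declares a type with that name.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: ranks documents for a term, boosting the ones that define it. */
final class DefinitionBoostSketch {

    // Hypothetical weight; the post does not state the real boost factor.
    static final double DEFINITION_BOOST = 10.0;

    /** Counts whole-word, case-sensitive occurrences of term (a plain text-match score). */
    static int countOccurrences(String source, String term) {
        Matcher m = Pattern.compile("\\b" + Pattern.quote(term) + "\\b").matcher(source);
        int count = 0;
        while (m.find()) {
            count++;
        }
        return count;
    }

    /** Crude regex stand-in for syntax analysis: does source declare a type named term? */
    static boolean definesType(String source, String term) {
        Pattern declaration =
                Pattern.compile("\\b(class|interface|enum)\\s+" + Pattern.quote(term) + "\\b");
        return declaration.matcher(source).find();
    }

    /** Text-match score plus a fixed bonus for documents that define the term. */
    static double score(String source, String term) {
        double base = countOccurrences(source, term);
        return definesType(source, term) ? base + DEFINITION_BOOST : base;
    }

    /** Returns a copy of the documents sorted by descending score. */
    static List<String> rank(List<String> sources, String term) {
        List<String> ranked = new ArrayList<>(sources);
        ranked.sort(Comparator.comparingDouble((String s) -> score(s, term)).reversed());
        return ranked;
    }
}
```

With the two FastHashMap documents from earlier, the usage file matches the term three times and the definition file only once, so a pure text-match score would put the usage first. The bonus is what moves the definition to the top.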
To see the difference for yourself, search for the common class “QueryBuilders” in the Elasticsearch repo. On GitHub, it shows up as the 6th result on the 18th page (at time of writing). In Bitbucket Cloud, the class definition shows up as the first result.
Languages, filters, and operators
Code aware search outperforms traditional search approaches for statically typed languages like Java that tend to repeat type names when importing, declaring, and instantiating types. However, Bitbucket Cloud’s code aware search is also highly effective for a range of other popular languages including JavaScript, Python, Ruby, and PHP, among others.
Since code aware search is built for source code, we also index the . and _ characters, which are commonly used in identifiers. This means you get more precise results for compound search terms, such as class, function, and variable names like “foo_bar.baz”.
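The effect of keeping those characters can be sketched with two small tokenizers in Java. This is an assumption about the general technique, not a description of how Bitbucket Cloud tokenizes code, and the class and method names are hypothetical. One tokenizer splits on anything that is not a letter or digit, as a prose-oriented indexer might. The other treats underscores and dot-joined segments as part of a single token.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative only: contrasts prose-style and identifier-aware tokenization. */
final class IdentifierTokenizerSketch {

    // Letters, digits, and underscores, optionally joined by dots,
    // so foo_bar.baz stays one token.
    private static final Pattern CODE_TOKEN =
            Pattern.compile("[A-Za-z0-9_]+(?:\\.[A-Za-z0-9_]+)*");

    // A prose-oriented tokenizer: runs of letters and digits only.
    private static final Pattern TEXT_TOKEN = Pattern.compile("[A-Za-z0-9]+");

    private static List<String> tokenize(String input, Pattern pattern) {
        List<String> tokens = new ArrayList<>();
        Matcher m = pattern.matcher(input);
        while (m.find()) {
            tokens.add(m.group());
        }
        return tokens;
    }

    static List<String> codeTokens(String input) {
        return tokenize(input, CODE_TOKEN);
    }

    static List<String> textTokens(String input) {
        return tokenize(input, TEXT_TOKEN);
    }
}
```

For the input `x = foo_bar.baz(1)`, `textTokens` yields `x`, `foo`, `bar`, `baz`, `1`, so a search for “foo_bar.baz” would also match any file containing those three words separately. `codeTokens` yields `x`, `foo_bar.baz`, `1`, which lets the compound name match as a unit.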
Additionally, you can restrict search results by using modifiers and operators. Use modifiers to filter by a particular language or file extension (like “ext:css” or “lang:ruby”) or to limit search to specific repos (like “repo:elasticsearch”). Use operators (like AND, OR, and NOT) to narrow down or broaden your results.
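As a rough illustration of how a query with modifiers breaks down, the Java sketch below separates `key:value` modifiers from the remaining search terms. It is a hypothetical parser written for this post, not Bitbucket Cloud's query grammar: it only knows the three modifiers named above, and it leaves operators such as AND, OR, and NOT in the term list without evaluating them.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Illustrative only: splits a search query into plain terms and key:value modifiers. */
final class SearchQuerySketch {

    // Only the modifiers mentioned in this post; the real list may differ.
    private static final Set<String> MODIFIERS = Set.of("ext", "lang", "repo");

    final List<String> terms = new ArrayList<>();
    final Map<String, String> filters = new LinkedHashMap<>();

    static SearchQuerySketch parse(String query) {
        SearchQuerySketch parsed = new SearchQuerySketch();
        for (String token : query.trim().split("\\s+")) {
            if (token.isEmpty()) {
                continue; // an empty query splits into a single empty token
            }
            int colon = token.indexOf(':');
            String key = colon > 0 ? token.substring(0, colon) : "";
            boolean hasValue = colon < token.length() - 1;
            if (MODIFIERS.contains(key) && hasValue) {
                parsed.filters.put(key, token.substring(colon + 1));
            } else {
                // Search terms and operators (AND, OR, NOT) are kept in order, unevaluated.
                parsed.terms.add(token);
            }
        }
        return parsed;
    }
}
```

Parsing `FastHashMap NOT test lang:java repo:elasticsearch` with this sketch leaves `FastHashMap`, `NOT`, and `test` as terms and records `lang=java` and `repo=elasticsearch` as filters.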
For a full list of the capabilities and search query considerations with code search in Bitbucket Cloud, check out our documentation.
Try Bitbucket Cloud’s code aware search
If you’re ready to use a fast and relevant code search, sign up for a Bitbucket Cloud account, create a repository, and index your code. If you’re already a Bitbucket customer, you can find code search in your sidebar and further documentation on it here.
Have more specific questions about this post? Reach out to us on Twitter to get the information you need.