Graph-based analysis of JavaScript source code repositories
- Track: Graph Processing devroom
- Room: H.2214
- Day: Saturday
- Start: 16:30
- End: 17:05
JavaScript is one of the most popular languages of the decade. It ranked #1 in popularity among Stack Overflow questions and consistently features in the top 10 languages of the TIOBE Index. Originally intended for client-side scripting, the language is now widely used to build complex desktop applications, write server-side code and program IoT devices. New editions of the language standard are released yearly under the ECMAScript trademark and add sophisticated features and syntactical constructs.
Static analysis is a software testing approach that is performed without compiling or executing the program itself. This allows developers to catch programming errors before building, testing and deploying the code. There is a wide range of static analysis tools: linters and code style analysers run their checks repeatedly inside IDEs, while more complex analysers, such as type checkers, run as part of the continuous integration (CI) process.
As JavaScript is a dynamic language, static analysis approaches are particularly useful: they can detect erroneous type usages that would not be revealed by building the code, and would otherwise only surface during thorough testing or, even worse, in production. Thanks to the popularity of the language, numerous static analysis tools are already available, such as Tern, Facebook’s Flow and TAJS. However, none of these met all of our requirements:
- Utilise both linter-style and complex global analysis rules.
- Evaluate rules in real time, i.e. in under a second after each “save” operation.
- Allow users to easily extend the analysis rules.
As none of the existing approaches satisfied these requirements, we built our own solution: it stores the code graphs used for analysis in a property graph query engine and evaluates the analysis rules as graph queries. Compared to other static analysis frameworks, the novelty of our solution is twofold:
- We continuously maintain code graphs based on the latest changes on the source code.
- We use the declarative openCypher language to define the static analysis and graph maintenance rules.
Using declarative queries, our tool can evaluate complex analysis rules quickly, including:
- Detecting asynchronous method calls with a missing await (a common issue in callback hell).
- Detecting unreachable code, i.e. code parts that cannot be reached through the control flow.
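To make the two rules above more concrete, here is a minimal sketch that evaluates them over a toy in-memory property graph in plain JavaScript. It is not the tool's implementation, which expresses such rules as openCypher queries. The node labels (Function, Call, Return), the edge types (CALLS, NEXT) and the properties (async, awaited, line) are all assumptions made up for this illustration.

```javascript
// Toy property graph for the following hypothetical function:
//   1: async function load(id) {
//   2:   const user = fetchUser(id);          // async callee, no await
//   3:   const posts = await fetchPosts(id);
//   4:   return user;
//   5:   log("done");                         // after return: unreachable
//      }
// Labels, edge types and property names are assumptions for illustration.
const graph = {
  nodes: [
    { id: "load", label: "Function", props: { name: "load", async: true } },
    { id: "fetchUser", label: "Function", props: { name: "fetchUser", async: true } },
    { id: "fetchPosts", label: "Function", props: { name: "fetchPosts", async: true } },
    { id: "log", label: "Function", props: { name: "log", async: false } },
    { id: "s1", label: "Call", props: { line: 2, awaited: false } },
    { id: "s2", label: "Call", props: { line: 3, awaited: true } },
    { id: "s3", label: "Return", props: { line: 4 } },
    { id: "s4", label: "Call", props: { line: 5, awaited: false } },
  ],
  edges: [
    { from: "s1", to: "fetchUser", type: "CALLS" },
    { from: "s2", to: "fetchPosts", type: "CALLS" },
    { from: "s4", to: "log", type: "CALLS" },
    // Control flow: function entry, then statements up to the return.
    { from: "load", to: "s1", type: "NEXT" },
    { from: "s1", to: "s2", type: "NEXT" },
    { from: "s2", to: "s3", type: "NEXT" },
  ],
};

// Rule 1: a call site that targets an async function but is not awaited.
function findMissingAwaits(g) {
  const byId = new Map(g.nodes.map((n) => [n.id, n]));
  return g.edges
    .filter((e) => e.type === "CALLS")
    .filter((e) => byId.get(e.to).props.async && !byId.get(e.from).props.awaited)
    .map((e) => byId.get(e.from).props.line);
}

// Rule 2: statements that cannot be reached from the function entry
// by following NEXT (control flow) edges, found with a breadth-first search.
function findUnreachable(g, entryId) {
  const seen = new Set([entryId]);
  const queue = [entryId];
  while (queue.length > 0) {
    const current = queue.shift();
    for (const e of g.edges) {
      if (e.type === "NEXT" && e.from === current && !seen.has(e.to)) {
        seen.add(e.to);
        queue.push(e.to);
      }
    }
  }
  return g.nodes
    .filter((n) => n.label !== "Function" && !seen.has(n.id))
    .map((n) => n.props.line);
}

const missingAwaitLines = findMissingAwaits(graph); // [2]
const unreachableLines = findUnreachable(graph, "load"); // [5]
console.log(missingAwaitLines, unreachableLines);
```

Both rules reduce to a pattern match or a reachability check over nodes and edges, which is the kind of condition a declarative graph query can state in a few lines. A realistic rule would need more context, for example to accept a promise that is stored and awaited later.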
The analysis can easily be extended with custom rules defined in the openCypher language. Building the system on openCypher also allows us to use different query engines: both mature databases, such as Neo4j, and experimental engines, such as our own ingraph engine. The latter is our research prototype that supports live query evaluation for Cypher queries, which allows near-instant answers even for complex analysis rules.
In this talk, we give an overview of the steps involved in transforming a source code file into a syntax graph and converting that into a call flow graph. We demonstrate how openCypher queries can be used to capture complex analysis rules in a concise way, and how ingraph allows us to continuously evaluate these queries.
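As a rough illustration of the first step, the sketch below flattens a syntax tree into property graph nodes and edges. It assumes an ESTree-style tree (the AST format emitted by common JavaScript parsers) and uses a hand-written tree for the one-line program `compute(42);` so that the example stays self-contained. The function name astToGraph and the choice to use AST field names as edge types are hypothetical, not taken from the tool.

```javascript
// Hand-written ESTree-style AST for the program `compute(42);`.
// A real pipeline would obtain this tree from a JavaScript parser.
const ast = {
  type: "Program",
  body: [
    {
      type: "ExpressionStatement",
      expression: {
        type: "CallExpression",
        callee: { type: "Identifier", name: "compute" },
        arguments: [{ type: "Literal", value: 42 }],
      },
    },
  ],
};

// Turns each AST node into a graph node (label = AST node type) and each
// parent-child relation into an edge typed by the field name it came from.
// Scalar fields such as `name` or `value` become node properties.
function astToGraph(root) {
  const nodes = [];
  const edges = [];

  function visit(astNode) {
    const id = nodes.length;
    const props = {};
    nodes.push({ id, label: astNode.type, props });
    for (const [key, value] of Object.entries(astNode)) {
      if (key === "type") continue;
      const children = Array.isArray(value) ? value : [value];
      for (const child of children) {
        const isAstNode =
          child !== null && typeof child === "object" && typeof child.type === "string";
        if (isAstNode) {
          edges.push({ from: id, to: visit(child), type: key });
        } else if (!Array.isArray(value)) {
          props[key] = child; // scalar such as a name or a literal value
        }
      }
    }
    return id;
  }

  visit(root);
  return { nodes, edges };
}

const syntaxGraph = astToGraph(ast);
console.log(syntaxGraph.nodes.map((n) => n.label));
console.log(syntaxGraph.edges);
```

Once the tree is stored this way, later steps such as resolving which function a call refers to can be phrased as queries that match existing nodes and add new edges between them.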
Intended audience: Developers of static analysis tools, users looking for a flexible analysis framework
Speaker biography.
Gabor Szarnyas is a researcher working on graph processing techniques. His core research areas are live graph pattern matching, benchmarking graph queries, and analyzing large-scale networks. His main research project is ingraph, an openCypher-compatible query engine supporting live query evaluation. His research team was the first to publish a formalisation that captures the semantics of a core subset of the openCypher language.
Gabor works at the Budapest University of Technology and Economics, teaching system modelling and database theory. He conducted research visits at the University of York, McGill University and the University of Waterloo. He is a member of the openCypher Implementers Group and the LDBC Social Network Benchmark task force. He received 1st prize in the ACM Student Research Competition at the MODELS 2016 conference. He is also a frequent speaker at industrial conferences (FOSDEM, GraphConnect) and meetups (openCypher meetup NYC, Budapest Neo4j meetup).
Speakers
Gabor Szarnyas