Analyze terabytes of OS code with one query
How to leverage the code shared on GitHub with ease
- Track: Lightning Talks
- Room: H.2215 (Ferrer)
- Day: Sunday
- Start: 13:00
- End: 13:15
Google has made a copy of most of the open source code shared on GitHub available in BigQuery. Anyone interested can now analyze 5 years of GitHub metadata and more than 42 terabytes of code with ease. In this session we'll cover how to use this data to understand the community around any language or project. Design requests and decisions can then be based on the actual patterns that analysis uncovers.
During a lightning talk we can quickly see:
- How this is run.
- How coding patterns have changed through time.
- How to guide your project design decisions based on actual usage of your APIs.
- How to request features based on data.
- The most effective phrasing to request changes.
- Effects of social media on a project's popularity.
- Who starred your project - and what other projects interest them.
- Measuring community health.
- Running static code analysis at scale.
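As a rough illustration of the "one query" idea, here is a minimal sketch that builds a BigQuery standard SQL query counting files with a given extension per repository. It assumes the public dataset table `bigquery-public-data.github_repos.sample_files` with `repo_name` and `path` columns; the helper name `build_query` and its parameters are hypothetical and not part of the talk. The sketch only builds the SQL text. Running it would require a BigQuery client and credentials, which are left out here.

```python
import re

# Assumption: the sample_files table of BigQuery's public GitHub dataset,
# with `repo_name` and `path` columns. Check the schema before relying on it.
TABLE = "bigquery-public-data.github_repos.sample_files"


def build_query(suffix: str, limit: int = 10) -> str:
    """Return SQL listing the repositories with the most files ending in `suffix`.

    `build_query` is a hypothetical helper written for this illustration.
    """
    # Accept only simple extensions such as ".go" or ".py", so the value
    # can be embedded in the SQL string literal safely.
    if not re.fullmatch(r"\.[A-Za-z0-9_+-]+", suffix):
        raise ValueError(f"unsupported file suffix: {suffix!r}")
    if limit < 1:
        raise ValueError("limit must be a positive integer")
    return (
        "SELECT repo_name, COUNT(*) AS file_count\n"
        f"FROM `{TABLE}`\n"
        f"WHERE ENDS_WITH(path, '{suffix}')\n"
        "GROUP BY repo_name\n"
        "ORDER BY file_count DESC\n"
        f"LIMIT {int(limit)}"
    )


if __name__ == "__main__":
    # Print the query text; paste it into the BigQuery console to try it.
    print(build_query(".go"))
```

The same shape (filter, group, order, limit) extends to most of the questions listed above once the relevant table is swapped in.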
Speakers
Felipe Hoffa