Extracting Data from Your Open Source Communities
Open source communities generate huge amounts of data just waiting to be analyzed. Getting this data into a format suitable for analysis may seem intimidating at first, but several open source tools make the task relatively easy. This talk focuses on the open source Metrics Grimoire tools, which collect data from various community sources and store it in a database where it can be easily queried and analyzed.
This talk will cover: CVSAnalY to gather and analyze source code repository data; MLStats to gather and analyze mailing list data; other Metrics Grimoire tools for bug trackers, IRC, wikis, and more; and Gource to visualize source code repository data.
This talk is for anyone interested in learning new ways to extract data from open source communities, including community participants, data geeks, and researchers. Attendees should have basic data science knowledge, including running database queries and performing basic data manipulation tasks.