When it comes to open source big data processing, Hadoop is no longer the only name in the game. Apache Spark is a general-purpose distributed data processing tool that allows users to process gigantic datasets across many nodes, coordinating the processing so that users can concentrate on writing their queries in their language of choice. At the beginning of this year, we announced a new world record in data processing set by Apache Spark: 100 TB of data in just 23 minutes. In the months that followed, interest in Apache Spark has not slowed, and the project has gained many new contributors and adopters.
The Blender Foundation is on a mission “to build a free and open source complete 3D creation pipeline for artists and small teams.” This year we’ve seen the power of Blender in the mix of Blender-related articles we’ve run on Opensource.com. Writer and Blender aficionado Jason van Gumster (author of Blender for Dummies) shared the majority of those stories, including reports from the recent Blender Conference in Amsterdam.
If you spend a lot of time managing files on your computer, you’re going to want a file manager that suits your needs and gives you features that let you quickly and easily take control of your file system. Dolphin, the default file manager in many KDE-based distributions, is a powerful tool to help you organize files. For more on Dolphin, take a look at Opensource.com community moderator David Both’s comprehensive review and guide to the Dolphin file manager from earlier this year.
The world of version control sure has changed since Git entered the scene 10 years ago as an open source alternative to BitKeeper for managing the Linux kernel’s source code. Since then, Git has rapidly become the most popular tool for tracking changes to files, and not just for code. Git helps track changes to files where revisioning, branching, and collaborative development can help improve the workflow of a project. Are you still working with an older source code manager, but thinking of moving to Git? Here are some great tips and resources for making the move.
To borrow from our review of this open source team chat alternative:
Piwik is an open source alternative to Google Analytics, and according to writer Scott Nesbitt, chances are it packs the features you need.
Nesbitt writes: “Those features include metrics on the number of visitors hitting your site, data on where they come from (both on the web and geographically), from what pages they leave your site, and the ability to track search engine referrals. Piwik also has a number of reports and you can customize the dashboard to view the metrics that you want to see. To make your life easier, Piwik integrates with over 65 content management, ecommerce, and online forum systems like WordPress, Magento, Joomla!, and vBulletin using plugins. With anything else, you just need to add a tracking code to a page on your site. A number of web hosting firms offer Piwik as part of their one-click install packages. You can test drive Piwik or use a hosted version.”
Fun fact: Maker of the LulzBot 3D printer, Aleph Objects, uses Piwik to run their analytics.
In the era of big data, now may be the time to learn R. It has become the programming language of choice for data scientists and others interested in statistical computing and graphics, and is touted by big data influencers like Revolution Analytics. Earlier this year, the R Consortium became a Linux Foundation Collaborative Project, created to provide support for the development of R-Hub, a new code-hosting platform for developing and distributing packages for R.
SugarCRM is the 800-pound gorilla in the open source customer relationship management space, and has previously been featured as one of our top 5 CRM tools. The community edition of SugarCRM can be used out of the box as a complete solution for organizations hoping to do a better job of keeping their contacts manageable, or who want to turn a list of names into something actionable. Complete with a huge list of features and a pluggable infrastructure that allows for even more customization, SugarCRM is a great solution for organizations that want to get a handle on their contacts.
In a nutshell, Vagrant is a command-line tool for launching and configuring virtual machines. With Vagrant, environments are reproducible and portable, and the data that defines the environment is stored in text files, making it easy to version control your environments and manage your virtual machines just as you would code. Vagrant allows you to set up development environments on your local machine that are nearly identical to your production environment, regardless of what your host operating system is. Plus, learning how to get started with Vagrant is easy.
Thanks to Jason Baker for his help on this article.