Big Data startup Tresata has released a set of algorithms that simplify machine learning and statistical analysis in large-scale Hadoop environments. Named “ganitha” after the Sanskrit word for mathematics, the library is built on Scalding, a Scala API for Cascading, a Java-based framework for building data processing applications on Hadoop.
The bundle includes Tresata’s homegrown K-Means clustering implementation and Apache Mahout vector integration. Abhishek Mehta, the founder of the company and a regular on SiliconANGLE’s theCube, wrote in a blog post that his team intends to extend ganitha in collaboration with the open source community.
Last month, Tresata entered a strategic partnership with Advanced Valuation Analytics Corporation (AVAC) to create analytics solutions for financial service providers. Mehta stated at the time that “only when data & technology experts work with the best scientists can you automate the discovery of knowledge from massive amounts of data.” He added that “collaborating with AVAC’s thought leaders allows Tresata to continue to develop the most advanced analytics applications that help monetize big data.”
The partnership between Tresata and AVAC was announced shortly after the former relaunched with refreshed branding and a new solutions lineup. Mehta told SiliconANGLE NewsDesk host Kristin Feledy that his company had shifted its focus from driving Hadoop adoption to helping enterprises monetize their Big Data, in an effort to address changing market dynamics. He stressed that having a consolidated, business-oriented Big Data strategy is essential for any organization that wishes to gain insights into consumer trends.
“You have to architect around a technology architecture that scales, which Hadoop as you all know does, allows you to handle both structured and unstructured data,” Mehta elaborated. “Our application suite, the power it produces for every dollar of hardware and the technical capability and prowess of our algorithms are unmatched in the industry, on any other system in the market.”