As sales grew and the company attracted more online and social media attention, the client struggled to manage the volume of data being generated across several sources: sales transactions, merchandise inventory, social media, and clickstream data. Needing a cost-effective way to manage this data in order to forecast sales and derive other critical business insights, the client decided to take immediate action.
The solution implemented a Hadoop cluster, with HDFS for storage and Hive for querying, to meet the client’s need for cost-effective data storage. It also made it easier for business users within the organization to access data when necessary. Because the data came from various sources, some of it was stored as files. The team ingested these data files into Hadoop, making the data immediately accessible online.
For data analytics, Hadoop and R were used, enabling data analysis, statistical computing, and data visualization. Various regression models were applied. Industry best practices in data architecture were also followed to keep the solution scalable and extensible and to meet cost, quality, scope, and timeline criteria.
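As an illustration of the kind of regression model mentioned above, the following is a minimal sketch of a simple linear trend fitted by ordinary least squares and used to forecast the next period. The project itself used R; Python is used here only for illustration. The function names and the monthly sales figures are hypothetical and do not come from the client's data, and the actual models may well have been more elaborate.

```python
from statistics import mean


def fit_linear_trend(xs, ys):
    """Fit y = intercept + slope * x by ordinary least squares."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need at least two paired observations")
    x_bar, y_bar = mean(xs), mean(ys)
    # Sum of squared deviations of x, and of cross deviations of x and y.
    sxx = sum((x - x_bar) ** 2 for x in xs)
    if sxx == 0:
        raise ValueError("x values must not all be identical")
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    return intercept, slope


def forecast(intercept, slope, x):
    """Predict y for a given x from the fitted line."""
    return intercept + slope * x


# Hypothetical monthly sales figures (illustrative only, not client data).
months = [1, 2, 3, 4, 5, 6]
sales = [100.0, 110.0, 120.0, 130.0, 140.0, 150.0]

intercept, slope = fit_linear_trend(months, sales)
next_month_sales = forecast(intercept, slope, 7)
print(f"Forecast for month 7: {next_month_sales:.1f}")
```

In practice, a sales forecast would usually include more predictors than a time index, such as promotions, seasonality, or clickstream signals, but the fitting step follows the same idea.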
- Successfully implemented a 100-node Hadoop cluster housing 10 TB of data and growing by 1 TB a year.
- Integrating Hadoop into the enterprise architecture for data management resulted in an architecture that scales with data volume. This reduces wasted storage and aids long-term cost reduction for the enterprise.
- The new data warehouse enabled the retailer to reduce the time required to launch new business initiatives and to meet regulatory compliance requirements with certifiable, accurate financial and operational data.
- Big data technologies implemented on commodity hardware resulted in 2M in savings compared with the higher cost of traditional RDBMS solutions.
- Enabled advanced analytics that provide a rich, consolidated view of customers and support effective sales forecasting.
- Enabled advanced predictive analytics for business users to facilitate inventory management.
- Aligned all business units by providing real-time enterprise data.
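The storage figures reported in the results (10 TB of data, growing by 1 TB a year) can be turned into a rough capacity projection. The sketch below assumes linear growth and a replication factor of 3, which is the HDFS default; the replication setting actually used on this cluster is not stated in the case study, so treat that value and the function name as assumptions.

```python
def projected_raw_storage_tb(initial_tb, growth_tb_per_year, years,
                             replication_factor=3):
    """Estimate the raw disk needed after a number of years.

    Assumes linear growth and that every block is stored
    `replication_factor` times (3 is the HDFS default; the value used
    on the client's cluster is an assumption).
    """
    if years < 0 or replication_factor < 1:
        raise ValueError("years must be >= 0 and replication_factor >= 1")
    logical_tb = initial_tb + growth_tb_per_year * years
    return logical_tb * replication_factor


# Figures from the results above: 10 TB today, growing by 1 TB a year.
raw_tb_in_five_years = projected_raw_storage_tb(10, 1, 5)
print(f"Raw storage needed in 5 years: {raw_tb_in_five_years} TB")
```

A real capacity plan would also reserve headroom for intermediate job output and account for compression, neither of which is modeled here.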