After initial research, FortySeven engineers identified the following key requirements for the solution: Extensive reporting on application usage data collected from users' devices, with the volume of reported data expected to grow. The ability to calculate both existing and new types of statistics from all collected data; in other words, the raw data must be stored indefinitely. The ability to filter calculated statistics and build charts based on specific criteria. The ability to run statistical calculations automatically for specific periods.
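The requirements above can be illustrated with a minimal Python sketch. It is not the FortySeven implementation: the event fields (`device`, `app_version`, `action`, `ts`), the sample records, and the function names are all hypothetical. It only shows why keeping the raw events matters: a filtered, per-period count and a statistic defined later can both be derived from the same stored data.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical raw usage events; field names and values are illustrative only.
EVENTS = [
    {"device": "a1", "app_version": "1.0", "action": "open", "ts": "2020-01-03T10:00:00"},
    {"device": "a1", "app_version": "1.0", "action": "open", "ts": "2020-01-20T09:30:00"},
    {"device": "b2", "app_version": "1.1", "action": "open", "ts": "2020-02-02T12:15:00"},
    {"device": "b2", "app_version": "1.1", "action": "share", "ts": "2020-02-05T08:00:00"},
]


def period_key(ts, period):
    """Map an ISO 8601 timestamp to a reporting-period label."""
    dt = datetime.fromisoformat(ts)
    if period == "day":
        return dt.strftime("%Y-%m-%d")
    if period == "month":
        return dt.strftime("%Y-%m")
    raise ValueError(f"unsupported period: {period}")


def count_events(events, period, **filters):
    """Count events per period, keeping only events that match every filter."""
    counts = defaultdict(int)
    for event in events:
        if all(event.get(key) == value for key, value in filters.items()):
            counts[period_key(event["ts"], period)] += 1
    return dict(counts)


def active_devices(events, period):
    """A statistic added later: distinct devices per period, from the same raw events."""
    devices = defaultdict(set)
    for event in events:
        devices[period_key(event["ts"], period)].add(event["device"])
    return {label: len(ids) for label, ids in devices.items()}
```

For example, `count_events(EVENTS, "month", action="open")` groups only the `open` events by month, and the resulting dictionary can be passed directly to a charting library.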
Performing all of these statistical calculations requires a lot of disk space and CPU power. The Amazon Web Services platform was chosen because it provides the complete set of services and resources this solution requires at a reasonable cost. AWS offers relatively inexpensive storage for an almost unlimited amount of data, and it supports the automated deployment of Hadoop clusters, which was necessary to meet the customer's requirements. On this platform, FortySeven developers implemented a multi-stage data pipeline.
The Amazon Web Services platform makes it possible to build a complex big data pipeline that avoids the main bottlenecks of big data processing:
Amazon S3 is one of the most cost-effective storage solutions in its class, and Amazon EMR provides an easy, highly automated way to deploy a large Hadoop cluster quickly, significantly reducing development effort and cost.
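As a rough sketch of what this kind of automated deployment can look like, the Python code below builds a request for the boto3 EMR `run_job_flow` call that starts a transient Hadoop cluster, runs one processing step over data in S3, and shuts down. The source does not describe the actual job definition, so everything specific here is an assumption: the cluster and step names, the instance types and counts, the release label, the use of Hadoop streaming with `mapper.py` and `reducer.py`, and the S3 locations passed in by the caller. The default IAM role names are the ones AWS creates for EMR and must already exist in the account.

```python
def build_emr_request(input_uri, output_uri, scripts_uri, core_nodes=2):
    """Build keyword arguments for boto3's EMR run_job_flow call.

    All names, sizes, and S3 locations are illustrative assumptions.
    """
    streaming_step = {
        "Name": "aggregate-usage-stats",  # hypothetical step name
        "ActionOnFailure": "TERMINATE_CLUSTER",
        "HadoopJarStep": {
            "Jar": "command-runner.jar",
            "Args": [
                "hadoop-streaming",
                "-files", f"{scripts_uri}/mapper.py,{scripts_uri}/reducer.py",
                "-mapper", "mapper.py",
                "-reducer", "reducer.py",
                "-input", input_uri,
                "-output", output_uri,
            ],
        },
    }
    return {
        "Name": "usage-stats",  # hypothetical cluster name
        "ReleaseLabel": "emr-6.15.0",
        "Applications": [{"Name": "Hadoop"}],
        "Instances": {
            "InstanceGroups": [
                {"Name": "master", "InstanceRole": "MASTER",
                 "InstanceType": "m5.xlarge", "InstanceCount": 1},
                {"Name": "core", "InstanceRole": "CORE",
                 "InstanceType": "m5.xlarge", "InstanceCount": core_nodes},
            ],
            # Transient cluster: terminate once the step has finished.
            "KeepJobFlowAliveWhenNoSteps": False,
        },
        "Steps": [streaming_step],
        "JobFlowRole": "EMR_EC2_DefaultRole",
        "ServiceRole": "EMR_DefaultRole",
    }


def launch(request, region="us-east-1"):
    """Submit the request to EMR; needs boto3 and AWS credentials."""
    import boto3  # third-party dependency, imported lazily

    emr = boto3.client("emr", region_name=region)
    return emr.run_job_flow(**request)["JobFlowId"]
```

Keeping the request builder separate from `launch` means the cluster definition can be inspected or tested without AWS access. Because the cluster terminates after its step, compute is paid for only while a calculation runs, while the raw data stays in S3.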