Tiatra, LLC
Information Technology Solutions for Washington, DC Government Agencies

Data Management on the Cloud Leveraging AWS

The history of data can be divided into two eras: pre-big data and post-big data.

In the pre-big data era, data was mostly structured and exchanged between enterprises through standard mechanisms such as network data mover (NDM). The need for near real-time insights was limited, and data extraction and transformation were batch-oriented and scheduled during non-peak hours to reduce MIPS (millions of instructions per second) usage and disruption to online production transactions. 

Also, data formats were limited, the most common format being delimited flat files with headers and trailers. Both headers and trailers stored important information such as data arrival time, data producer information, and the number of records in the file. 
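As a minimal sketch of how such a file might be validated on arrival, the Python below parses a pipe-delimited file and checks the trailer's record count against the records actually received. The `HDR`/`TRL` markers, the field order, and the sample data are assumptions made for illustration; real layouts varied from one data producer to another.

```python
from dataclasses import dataclass


@dataclass
class FlatFile:
    producer: str
    arrival_time: str
    records: list[list[str]]


def parse_flat_file(text: str, delimiter: str = "|") -> FlatFile:
    """Parse a delimited flat file and verify the trailer's record count.

    Assumed (hypothetical) layout:
      header  -> HDR|<producer>|<arrival time>
      records -> one delimited line per record
      trailer -> TRL|<record count>
    """
    lines = [line for line in text.splitlines() if line.strip()]
    if len(lines) < 2:
        raise ValueError("file needs at least a header and a trailer")

    header = lines[0].split(delimiter)
    trailer = lines[-1].split(delimiter)
    if header[0] != "HDR" or len(header) != 3:
        raise ValueError("malformed header")
    if trailer[0] != "TRL" or len(trailer) != 2:
        raise ValueError("malformed trailer")

    records = [line.split(delimiter) for line in lines[1:-1]]
    expected = int(trailer[1])
    if expected != len(records):
        # A mismatch usually means the file was truncated in transit.
        raise ValueError(f"trailer says {expected} records, found {len(records)}")
    return FlatFile(producer=header[1], arrival_time=header[2], records=records)


# Hypothetical sample file used only to exercise the parser.
SAMPLE = """HDR|BILLING|2022-04-11T02:00:00
1001|ACME|250.00
1002|GLOBEX|99.50
TRL|2
"""

if __name__ == "__main__":
    parsed = parse_flat_file(SAMPLE)
    print(parsed.producer, parsed.arrival_time, len(parsed.records))
```

Checking the count before loading is what let batch jobs reject a partially transmitted file instead of loading incomplete data.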

Moreover, relational database management systems (RDBMSs) such as DB2, hierarchical databases such as IMS DB, flat files, and custom extract, transform, load (ETL) logic written in COBOL or PL/I were sufficient to address data ingestion, analysis, and storage. Since the sources of data were limited, the volume of data was easier to manage.

As we ushered in the era of big data, enterprises expected more value from data, because advances in technology provided the capacity to gather, store, and analyze exponentially growing volumes and varieties of data. With the ability to extract more, and more timely, business insights than ever before, data has become a competitive advantage for enterprises that can extract actionable information from their diverse data sources and formats.

At the same time, increasing regulatory requirements have also necessitated ingesting data from diverse sources to make informed decisions. For example, regulatory authorities in California mandate the collection, storage, and analysis of data to reduce the disruption caused by wildfires, which take a huge economic toll on communities and businesses every year. To comply, utility companies need to ingest and analyze voluminous data and apply artificial intelligence or machine learning-based prediction techniques to it. This shift in the dynamics of data resulted in exponential growth in data volume, data sources, data exchange patterns, and data formats.

Managing volume and complexity of data

Today, a significant amount of enterprise data is generated from external sources rather than internal systems of record (SORs). Stored data now includes engagement data as well as transactional data, and engagement data can be 10 to 20 times the volume of transactional data. Although big data technologies introduced distributed storage and accelerated data processing through massively parallel processing, they do not address dynamically scaling data acquisition, storage, and processing based on demand.

Elastic scaling of compute and storage on-premises is labor-intensive, cumbersome, and expensive. Data acquisition from multiple external sources adds further overhead. Consequently, enterprises face several challenges with on-premises data management. It is difficult to:

  1. Scale up data processing and storage for an exponential increase in polymorphic data
  2. Manage different mechanisms to ingest data from external and internal systems
  3. Ensure high availability of data and secure, near-real-time access to data insights

Necessity is the mother of invention

The evolution of cloud computing coincided with an exponential growth in data. The cloud abstracted away the problem of scaling storage and processing power on demand. It also provided a managed data landing zone for data ingestion from various internal and external systems.

Amazon Web Services (AWS) offers a broad spectrum of highly available, fully managed data services catering to several types of data, be it relational, semi-structured, or unstructured. Amazon Relational Database Service (RDS) and Amazon Aurora cater to the relational domain, while Amazon DynamoDB is a NoSQL database service. 

AWS also provides managed services compatible with other popular NoSQL databases, such as Amazon DocumentDB (with MongoDB compatibility) and Amazon Keyspaces (for Apache Cassandra). Apart from these managed services, leading NoSQL databases such as Couchbase, MongoDB, and Cassandra have managed database-as-a-service offerings on AWS, and customers can also use Amazon EC2 (Elastic Compute Cloud) to install and run these databases themselves as independent software.

Navigating data migration, powered by AWS and Infosys migration strategy 

A sound data migration strategy is essential to ensure seamless operations and business continuity. In some cases, it may be beneficial to retain certain types of data on-premises due to regulatory requirements. The data migration approach may vary based on the size and nature of the data. 

For example, if the volume of data is huge, it is prudent to adopt the AWS Snow Family, which comprises AWS Snowcone, AWS Snowball, and AWS Snowmobile. This suite of services offers a number of physical devices and capacity points to help physically transport up to exabytes of data into the AWS Cloud.
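A rough calculation shows why physical transport becomes attractive at this scale. The sketch below estimates how long a network transfer would take; the link speeds and the 80% sustained utilization are illustrative assumptions, not measured figures.

```python
PETABYTE = 10**15          # bytes (decimal petabyte)
GBPS = 10**9               # bits per second
SECONDS_PER_DAY = 86_400


def network_transfer_days(
    data_bytes: float,
    link_bits_per_second: float,
    utilization: float = 0.8,  # assumed share of the link usable for the transfer
) -> float:
    """Estimate the days needed to push data_bytes over a network link."""
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    effective_bps = link_bits_per_second * utilization
    seconds = data_bytes * 8 / effective_bps
    return seconds / SECONDS_PER_DAY


if __name__ == "__main__":
    for gbps in (1, 10):
        days = network_transfer_days(PETABYTE, gbps * GBPS)
        print(f"1 PB over {gbps} Gbps: about {days:.0f} days")
```

Under these assumptions, one petabyte takes roughly 116 days over a 1 Gbps link and about 12 days over a 10 Gbps link, before accounting for retries or competing traffic.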

For data transformation, AWS provides Amazon EMR (Elastic MapReduce), which manages Hadoop clusters in the cloud, and AWS Glue for managed ETL. Furthermore, Amazon Athena and Amazon Redshift Spectrum support a data lakehouse implementation in the cloud, and Amazon QuickSight adds a visualization layer for business users.

For continuous data ingestion from various sources into the AWS Cloud, AWS provides data migration and ingestion services such as AWS Database Migration Service (DMS), which ingests relational data into AWS. Amazon Kinesis services help to ingest, store, and process streaming data.
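As a hedged illustration of streaming ingestion, the sketch below batches events for the Kinesis Data Streams `PutRecords` call through boto3. The stream name, the `device_id` partition key field, and the event shape are hypothetical; the batching respects the limit of 500 records per request, but per-request size limits and retries of failed records are deliberately left out.

```python
import json
from typing import Any, Iterable, Iterator

MAX_RECORDS_PER_CALL = 500  # PutRecords accepts at most 500 records per request


def to_kinesis_records(
    events: Iterable[dict[str, Any]], key_field: str
) -> list[dict[str, Any]]:
    """Convert events to the record shape PutRecords expects."""
    return [
        {"Data": json.dumps(event).encode("utf-8"), "PartitionKey": str(event[key_field])}
        for event in events
    ]


def batches(records: list, size: int = MAX_RECORDS_PER_CALL) -> Iterator[list]:
    """Yield consecutive slices of at most `size` records."""
    for start in range(0, len(records), size):
        yield records[start:start + size]


def send(
    events: Iterable[dict[str, Any]],
    stream_name: str,
    key_field: str,
    client: Any = None,
) -> int:
    """Send events to a stream and return the number of failed records."""
    if client is None:
        import boto3  # imported lazily so the helpers above run without AWS access

        client = boto3.client("kinesis")
    failed = 0
    for batch in batches(to_kinesis_records(events, key_field)):
        response = client.put_records(StreamName=stream_name, Records=batch)
        failed += response.get("FailedRecordCount", 0)
    return failed


if __name__ == "__main__":
    # Hypothetical events; running this requires AWS credentials and an existing stream.
    sample = [{"device_id": i, "temperature_c": 21.5} for i in range(3)]
    print(send(sample, stream_name="example-stream", key_field="device_id"))
```

A production producer would also resend the records that `PutRecords` reports as failed, since the call can partially succeed.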

Post-migration, enterprises need to manage running costs. Implementing an observability layer helps track and optimize resource usage on the cloud. The metrics collected through AWS CloudTrail, Amazon CloudWatch, and AWS billing metrics help enterprises build this layer.

Infosys has worked with several global clients in migrating, modernizing, and building data platforms on the cloud. We believe a platform-based approach to migrating applications and data to the cloud is imperative for a seamless migration.

For example, we redesigned the data landscape of a device manufacturer to better manage almost a petabyte of data residing in on-premises network-attached storage (NAS). The data was growing by 300% year on year. The system allowed users to upload images, incident descriptions, and application logs related to device defects. The data management solution was designed using Amazon S3, Amazon EMR, and the AWS Glue Data Catalog for metadata management. Our choice was determined by several factors:

  1. Amazon Simple Storage Service (Amazon S3) provides a secure, scalable, and highly available object store for the petabyte-scale file storage on the NAS.
  2. The Amazon S3 TransferManager in the AWS SDK helps manage large file uploads through multipart uploads.
  3. Amazon S3 Transfer Acceleration routes data through the nearest edge location over an optimized network path for faster and more secure transfer of files.
  4. Amazon S3 provides a common and standard landing zone for data exchange between stakeholders.
  5. Amazon EMR and the AWS Glue Data Catalog are a good fit for large-volume ETL processing at scale and for storing metadata that goes through frequent structural changes.
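To make the multipart upload idea in the list above concrete, the sketch below plans how a large file could be split into parts. It is not the TransferManager's implementation, only the underlying arithmetic; the 64 MiB default part size is an assumption, while the 5 MiB minimum part size and the 10,000-part ceiling are Amazon S3 multipart limits.

```python
MIB = 1024 * 1024
MIN_PART_SIZE = 5 * MIB   # S3 minimum size for every part except the last
MAX_PARTS = 10_000        # S3 maximum number of parts per multipart upload


def plan_parts(file_size: int, part_size: int = 64 * MIB) -> list[tuple[int, int, int]]:
    """Return (part_number, offset, length) tuples that cover the whole file."""
    if file_size <= 0:
        raise ValueError("file_size must be positive")
    if part_size < MIN_PART_SIZE:
        raise ValueError("part_size is below the S3 minimum of 5 MiB")

    # Double the part size until the file fits within the part-count ceiling.
    while -(-file_size // part_size) > MAX_PARTS:  # ceiling division
        part_size *= 2

    parts = []
    offset = 0
    number = 1  # S3 part numbers start at 1
    while offset < file_size:
        length = min(part_size, file_size - offset)
        parts.append((number, offset, length))
        offset += length
        number += 1
    return parts


if __name__ == "__main__":
    for number, offset, length in plan_parts(150 * MIB):
        print(f"part {number}: offset={offset} length={length}")
```

Each planned part can be uploaded independently and retried on its own, which is what makes multipart uploads resilient for very large files.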

Migrating data and application workloads to the cloud is imperative for enterprises to future-proof their businesses. A well-orchestrated, automated approach allows enterprises to realize the benefits of migrating data to the cloud.

To lend predictability to the modernization, Infosys offers its customers the Infosys Modernization Suite and its component, the Infosys Database Migration Platform, which is part of Infosys Cobalt. This helps enterprises migrate from on-premises RDBMSs to cloud databases such as Amazon RDS and Amazon Aurora, or to NoSQL databases such as Amazon DynamoDB and Amazon DocumentDB.

About the authors:
Naresh Duddu, AVP and Head, Cloud & Open Source, Modernization Practice, Infosys

Jignesh Desai is the AWS WW Migration Partner Solutions Architect for Infosys

Saurabh Shrivastava is the AWS Global SA Leader for Infosys 



Category: News
April 11, 2022