In this fast-growing world, a huge amount of data is produced every moment from sources all over the world: machine logs, traffic signals, IoT devices, smart devices installed in homes and industry, and many others. Once this data exists, the next problem is storing, configuring, managing and streaming it.
How do we manage data that occupies storage, consumes compute power and feeds the analysis behind important decisions?
AWS has a solution. Amazon Kinesis Streams is the service for streaming this data.
Kinesis Streams collects data from the source and streams it to applications for further analysis. The data is replicated across Availability Zones for high availability and reliability. It scales with the incoming data, from megabytes to terabytes. Producers load data into the stream over HTTPS, with the Kinesis Producer Library or with the Kinesis Agent, while consuming applications typically read it with the Kinesis Client Library. By default, data in Kinesis Streams is available for 24 hours, and retention can be extended up to 7 days.
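To make the producer side concrete, here is a minimal sketch in Python, assuming the boto3 SDK is the HTTPS client. The stream name, the event fields and the helper names `build_stream_record` and `send_to_stream` are made up for illustration; `put_record` with `StreamName`, `Data` and `PartitionKey` is the Kinesis Streams PutRecord call.

```python
import json


def build_stream_record(stream_name, event, partition_key):
    """Build the keyword arguments for a Kinesis Streams PutRecord call."""
    return {
        "StreamName": stream_name,
        "Data": json.dumps(event).encode("utf-8"),  # records are raw bytes
        "PartitionKey": partition_key,              # decides which shard gets the record
    }


def send_to_stream(params):
    """Send one record over HTTPS. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without the SDK

    return boto3.client("kinesis").put_record(**params)


# Hypothetical stream name and event, for illustration only.
params = build_stream_record(
    "example-stream",
    {"sensor": "signal-42", "state": "green"},
    partition_key="signal-42",
)
```

Calling `send_to_stream(params)` would then write the record, provided a stream with that name exists in your account.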
Kinesis Streams addresses analysis, compute power and decision making, but storage remains a problem, because it keeps data for only 24 hours by default and for 7 days at most.
What if we need to store the data for longer?
What if we need to access the data later for another set of tasks?
This is where Kinesis Firehose, a newer AWS service, comes into the picture.
It is an easier way of streaming data than Kinesis Streams. It takes care of monitoring, scaling and data management, and provides data security. This blog will take you through Kinesis Firehose in and out.
Kinesis Firehose captures data from web apps, sensors, mobile applications and various other sources and streams it into Amazon S3, Amazon Redshift and/or Amazon Elasticsearch.
It loads massive volumes of streaming data into Amazon S3 and Amazon Redshift automatically.
It is a fully managed service that scales the stream with the data volume and needs no administration. It can also batch, compress and encrypt data before loading, which reduces the storage used at the destination and increases security.
A delivery stream is a stream, or collection, of data records. You first create a delivery stream in Firehose and then send data to it, and Firehose stores that data in either S3 or Redshift.
You can create a delivery stream using the Firehose console or the CreateDeliveryStream API call.
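As a rough illustration of the API route, the sketch below builds a CreateDeliveryStream request for an S3 destination and passes it to boto3. The delivery stream name, role ARN and bucket ARN are placeholders, and the helper names are made up for this example; the IAM role must already allow Firehose to write to the bucket.

```python
def build_delivery_stream_request(name, role_arn, bucket_arn):
    """Build the keyword arguments for a CreateDeliveryStream call (S3 destination)."""
    return {
        "DeliveryStreamName": name,
        "DeliveryStreamType": "DirectPut",  # producers write straight to Firehose
        "ExtendedS3DestinationConfiguration": {
            "RoleARN": role_arn,      # role Firehose assumes to write to S3
            "BucketARN": bucket_arn,  # destination bucket
        },
    }


def create_delivery_stream(request):
    """Create the delivery stream. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the builder above runs without the SDK

    return boto3.client("firehose").create_delivery_stream(**request)


# Placeholder identifiers, for illustration only.
request = build_delivery_stream_request(
    "example-delivery-stream",
    "arn:aws:iam::123456789012:role/example-firehose-role",
    "arn:aws:s3:::example-bucket",
)
```

Keeping the request as a plain dictionary makes it easy to review or log before anything is created in the account.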
Records are the data blobs (binary data) that a data producer sends to a delivery stream. Each record can be at most 1000 KB.
The destination is the data store where the data is delivered. Here, Amazon S3 and Amazon Redshift are the destinations.
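A producer can check the size limit before sending. The sketch below assumes that 1000 KB means 1000 × 1024 bytes measured on the raw blob; the constant and helper names are hypothetical. `put_record` with `DeliveryStreamName` and `Record={"Data": ...}` is the Firehose PutRecord call in boto3.

```python
MAX_RECORD_BYTES = 1000 * 1024  # assumption: 1000 KB counted as 1000 * 1024 bytes


def to_firehose_record(blob):
    """Wrap a bytes blob as a Firehose record, rejecting oversized blobs."""
    if len(blob) > MAX_RECORD_BYTES:
        raise ValueError(
            f"record is {len(blob)} bytes, limit is {MAX_RECORD_BYTES}"
        )
    return {"Data": blob}


def send_record(delivery_stream_name, blob):
    """Send one record to a delivery stream. Needs boto3 and AWS credentials."""
    import boto3  # imported lazily so the size check runs without the SDK

    return boto3.client("firehose").put_record(
        DeliveryStreamName=delivery_stream_name,
        Record=to_firehose_record(blob),
    )


record = to_firehose_record(b'{"level": "INFO", "msg": "request served"}\n')
```

Larger payloads would have to be split into several records by the producer before sending.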
Features of Kinesis Firehose
Kinesis Firehose takes care of the infrastructure, storage, networking and configuration needed to load data into S3 and Redshift. There is no need to worry about provisioning, deploying or maintaining hardware or software to manage the process.
You pay only for the amount of data transmitted through the service. There are no minimum fees or upfront commitments.
Kinesis Firehose buffers the incoming stream for a certain period or until a certain amount of data has accumulated. As soon as either condition is met, it delivers the buffered data to the destination.
It provides a high level of data security. Firehose also has an option to encrypt the data automatically before moving it to the destination.
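The "whichever comes first" rule can be illustrated with a small in-memory simulation. This is only a model of the behaviour described above, not Firehose's actual implementation; the class name, the tiny byte limit and the explicit `now` timestamps are assumptions chosen to keep the example short.

```python
class BufferSketch:
    """Toy buffer that flushes when a size OR a time threshold is reached."""

    def __init__(self, size_limit_bytes, interval_seconds):
        self.size_limit_bytes = size_limit_bytes
        self.interval_seconds = interval_seconds
        self.records = []
        self.buffered_bytes = 0
        self.started_at = None  # time the first buffered record arrived
        self.delivered = []     # each entry is one batch sent to the destination

    def add(self, record, now):
        if self.started_at is None:
            self.started_at = now
        self.records.append(record)
        self.buffered_bytes += len(record)
        self._flush_if_due(now)

    def tick(self, now):
        """Called periodically so the time threshold is checked without new data."""
        self._flush_if_due(now)

    def _flush_if_due(self, now):
        if not self.records:
            return
        size_hit = self.buffered_bytes >= self.size_limit_bytes
        time_hit = now - self.started_at >= self.interval_seconds
        if size_hit or time_hit:
            self.delivered.append(b"".join(self.records))
            self.records = []
            self.buffered_bytes = 0
            self.started_at = None


buf = BufferSketch(size_limit_bytes=10, interval_seconds=60)
buf.add(b"aaaa", now=0)     # 4 bytes buffered, nothing delivered yet
buf.add(b"bbbbbbb", now=1)  # 11 bytes buffered, size threshold reached
buf.add(b"cc", now=5)       # starts a new batch
buf.tick(now=64)            # 59 seconds elapsed, still buffering
buf.tick(now=65)            # 60 seconds elapsed, time threshold reached
```

After this run, `buf.delivered` holds two batches: the first flushed by size, the second by time.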
Configuration Management
Firehose buffers the incoming stream for a certain period of time before delivering it to the destination. Buffer size is specified in MB and buffer interval in seconds.
Choose a buffer size (1 – 128 MB) and a buffer interval (60 – 900 seconds) for data delivery to Amazon S3.
Data compression reduces the number of bits needed to store the same amount of data. Three compression formats are supported (GZIP, ZIP and SNAPPY), or you can choose no compression.
Choose whether to encrypt the data with a key from AWS Key Management Service or to use no encryption.
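To get a feel for the effect of compression, the short sketch below compresses a batch of repetitive log lines locally with Python's standard `gzip` module. The sample log line is made up, and the ratio you see on real data depends heavily on how repetitive that data is.

```python
import gzip

# Hypothetical batch of repetitive log lines, similar to what a buffer might hold.
raw = b'{"level": "INFO", "msg": "request served"}\n' * 1000

compressed = gzip.compress(raw)
ratio = len(compressed) / len(raw)  # fraction of the original size

# Compression is lossless: decompressing restores the original bytes.
restored = gzip.decompress(compressed)
```

Printing `len(raw)`, `len(compressed)` and `ratio` shows how much storage the compressed batch would save at the destination.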
Kinesis Firehose does not process or interpret the raw data. You simply create a delivery stream and write data records to it.
Configure buffer and compression options.
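The buffer, compression and encryption choices above can be gathered into one options dictionary. The sketch below validates the ranges quoted in this post and uses the field names of the Firehose S3 destination configuration (`BufferingHints`, `CompressionFormat`, `EncryptionConfiguration`). The helper name and its default values (5 MB, 300 seconds) are assumptions for illustration.

```python
ALLOWED_COMPRESSION = {"UNCOMPRESSED", "GZIP", "ZIP", "Snappy"}


def build_s3_options(size_mb=5, interval_s=300,
                     compression="UNCOMPRESSED", kms_key_arn=None):
    """Build buffer, compression and encryption options for an S3 destination."""
    if not 1 <= size_mb <= 128:
        raise ValueError("buffer size must be between 1 and 128 MB")
    if not 60 <= interval_s <= 900:
        raise ValueError("buffer interval must be between 60 and 900 seconds")
    if compression not in ALLOWED_COMPRESSION:
        raise ValueError(f"unsupported compression format: {compression}")

    if kms_key_arn:
        encryption = {"KMSEncryptionConfig": {"AWSKMSKeyARN": kms_key_arn}}
    else:
        encryption = {"NoEncryptionConfig": "NoEncryption"}

    return {
        "BufferingHints": {"SizeInMBs": size_mb, "IntervalInSeconds": interval_s},
        "CompressionFormat": compression,
        "EncryptionConfiguration": encryption,
    }


# Example: 64 MB or 5 minutes, whichever comes first, GZIP, no encryption.
options = build_s3_options(size_mb=64, interval_s=300, compression="GZIP")
```

These keys would be merged into the S3 destination configuration passed to CreateDeliveryStream, next to the role and bucket ARNs.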
In this blog, we saw what Kinesis Firehose is, why it is needed, where to use it and how to configure it. In my next blog, we will see how to use Firehose to get analytics from application logs.