How to use the command 'aws kinesis' (with examples)

Amazon Kinesis Data Streams is a managed service for collecting and processing real-time streaming data. It handles the operational work of running data streams and scales to large data volumes. The AWS Command Line Interface (CLI) offers a straightforward way to interact with the service, so developers can create, manage, and use data streams from a terminal. The following examples demonstrate common operations using the aws kinesis command.

Use case 1: Show all streams in the account

Code:

aws kinesis list-streams

Motivation: Understanding which Kinesis streams are currently active and accessible in your AWS account is a fundamental aspect of managing streaming data applications. This command is particularly useful for administrators and developers who need to monitor available resources or audit and clean up unused streams.

Explanation:

  • aws kinesis: Selects the AWS service to interact with, namely Kinesis Data Streams.
  • list-streams: Lists the data streams in your AWS account for the currently configured region.

Example Output:

{
    "StreamNames": [
        "example-stream-1",
        "example-stream-2",
        "example-stream-3"
    ]
}
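For scripting, the JSON output can be reduced to plain stream names with the CLI's global --query and --output options. The sketch below is illustrative only: list_stream_names is a hypothetical helper name, and it assumes the AWS CLI is installed with credentials and a default region configured.

```shell
# Hypothetical helper: print one stream name per line.
# Assumes the AWS CLI is installed and configured.
list_stream_names() {
    # --query selects the StreamNames array; --output text prints its
    # elements separated by tabs, which tr turns into newlines.
    aws kinesis list-streams --query 'StreamNames[]' --output text | tr '\t' '\n'
}
```

Calling list_stream_names then prints each name on its own line, which is convenient for loops or for piping into grep.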

Use case 2: Write one record to a Kinesis stream

Code:

aws kinesis put-record --stream-name example-stream --partition-key key1 --data dGVzdCBtZXNzYWdlIGJhc2U2NA==

Motivation: Writing records to a Kinesis stream is essential for feeding data into a stream for real-time processing. This command allows you to inject base64-encoded data directly into a specified Kinesis stream, ensuring that your system can start processing information as soon as it becomes available.

Explanation:

  • put-record: This is the action that writes a single data record into a Kinesis stream.
  • --stream-name example-stream: Specifies the stream to which the record should be written, by providing the stream’s unique name.
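  • --partition-key key1: The key Kinesis uses to assign the record to a shard. The key is hashed, and the hash determines the shard, so records with the same partition key are routed to the same shard.
  • --data dGVzdCBtZXNzYWdlIGJhc2U2NA==: The base64-encoded data payload, which decodes to “test message base64”. AWS CLI version 2 expects base64 input for this parameter by default; version 1 accepts the raw text and encodes it for you.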

Example Output:

{
    "ShardId": "shardId-000000000000",
    "SequenceNumber": "49640610668614886128711708916060667677544601327287074818"
}
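Before sending a payload, it can help to confirm locally that the base64 string decodes to the text you expect. The following is a minimal sketch that does not call AWS at all; it assumes a base64 utility that accepts --decode (GNU coreutils and macOS both do), and the variable names are arbitrary.

```shell
# Encode a payload without a trailing newline, then decode it again
# to confirm the round trip. Nothing here talks to AWS.
payload='test message base64'
encoded="$(printf '%s' "$payload" | base64)"
decoded="$(printf '%s' "$encoded" | base64 --decode)"

echo "$encoded"   # the string passed to --data
echo "$decoded"   # should match the original payload
```

The value of encoded is what would be passed to --data in the command above.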

Use case 3: Write a record to a Kinesis stream with inline base64 encoding

Code:

aws kinesis put-record --stream-name example-stream --partition-key key1 --data "$( echo "my raw message" | base64 )"

Motivation: As an alternative to the previous method, the base64 encoding is performed inline. This simplifies the workflow because you can type the raw message in your terminal instead of preparing a base64-encoded string in advance.

Explanation:

  • --data "$( echo "my raw message" | base64 )": The command substitution pipes the string “my raw message” through base64 and passes the encoded result as the payload. Note that echo appends a newline, which becomes part of the record; use printf '%s' "my raw message" instead if the trailing newline is unwanted.

Example Output:

{
    "ShardId": "shardId-000000000001",
    "SequenceNumber": "21249316948932777593284527714351939050655784386719800322"
}
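To see exactly what the inline encoding sends, the echo and printf variants can be compared locally without calling AWS. This is a small sketch under the assumption that a standard base64 utility is on the PATH; the variable names are illustrative.

```shell
# echo appends a newline before encoding; printf '%s' does not.
with_newline="$(echo "my raw message" | base64)"
without_newline="$(printf '%s' "my raw message" | base64)"

echo "$with_newline"      # bXkgcmF3IG1lc3NhZ2UK
echo "$without_newline"   # bXkgcmF3IG1lc3NhZ2U=
```

The two strings differ only at the end, where the first one carries the encoded newline. Consumers that compare payloads byte for byte will treat them as different records.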

Use case 4: List the shards available on a stream

Code:

aws kinesis list-shards --stream-name example-stream

Motivation: Shards are the units of capacity of a Kinesis stream; each shard supports a fixed rate of writes and reads. Listing the shards shows how the stream's hash key space is divided, which helps you make informed decisions about scaling and resource allocation.

Explanation:

  • list-shards: This command retrieves a list of shards belonging to a Kinesis stream.
  • --stream-name example-stream: Indicates the specific stream whose shards you want to list.

Example Output:

{
    "Shards": [
        {
            "ShardId": "shardId-000000000000",
            "HashKeyRange": {"StartingHashKey": "0", "EndingHashKey": "340282366920938463463374607431768211455"},
            "SequenceNumberRange": {"StartingSequenceNumber": "49598603003000172226866489272653839642460807318186905602"}
        }
    ]
}
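When only the shard IDs and their hash key ranges are needed, the output can be flattened into one row per shard with --query and --output text. The helper name list_shard_ranges below is hypothetical, and the sketch assumes a configured AWS CLI.

```shell
# Hypothetical helper: print "ShardId  StartingHashKey  EndingHashKey"
# as tab-separated columns, one shard per line.
# $1: stream name. Assumes the AWS CLI is installed and configured.
list_shard_ranges() {
    aws kinesis list-shards \
        --stream-name "$1" \
        --query 'Shards[].[ShardId,HashKeyRange.StartingHashKey,HashKeyRange.EndingHashKey]' \
        --output text
}
```

For the single-shard stream shown above, list_shard_ranges example-stream would print one line covering the whole hash key space.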

Use case 5: Get a shard iterator for reading from the oldest message in a stream’s shard

Code:

aws kinesis get-shard-iterator --shard-iterator-type TRIM_HORIZON --stream-name example-stream --shard-id shardId-000000000000

Motivation: To read data from a shard in a Kinesis stream, you first need a shard iterator. An iterator of type TRIM_HORIZON starts at the oldest untrimmed record, so you can read everything the shard still holds within the stream's retention period.

Explanation:

  • get-shard-iterator: This operation retrieves the iterator for a specific shard.
  • --shard-iterator-type TRIM_HORIZON: Specifies the iterator type. TRIM_HORIZON provides an iterator that will read from the oldest available record in the shard.
  • --stream-name example-stream: The name of the stream containing the shard.
  • --shard-id shardId-000000000000: Identifies the specific shard from which the iterator will read.

Example Output:

{
    "ShardIterator": "AAAAAAAAAAGhqcqBcV3DemoAX80vXY8xOzSCshRYEuvS8E1udRVTV0V9pmerMC6Dcul/data"
}
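Because the iterator is a long opaque string, it is usually easier to capture it in a shell variable than to copy it by hand. The sketch below shows one way to do that; get_oldest_iterator is a hypothetical helper name, and a configured AWS CLI is assumed.

```shell
# Hypothetical helper: print a TRIM_HORIZON iterator for one shard.
# $1: stream name, $2: shard ID. Assumes a configured AWS CLI.
get_oldest_iterator() {
    # --query extracts the ShardIterator field; --output text prints
    # it without JSON quoting so it can be stored in a variable.
    aws kinesis get-shard-iterator \
        --shard-iterator-type TRIM_HORIZON \
        --stream-name "$1" \
        --shard-id "$2" \
        --query 'ShardIterator' --output text
}
```

A typical use would be iterator="$(get_oldest_iterator example-stream shardId-000000000000)", after which "$iterator" can be passed to other commands.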

Use case 6: Read records from a shard using a shard iterator

Code:

aws kinesis get-records --shard-iterator AAAAAAAAAAGhqcqBcV3DemoAX80vXY8xOzSCshRYEuvS8E1udRVTV0V9pmerMC6Dcul/data

Motivation: Retrieving records from a Kinesis shard is at the core of processing streaming data in real-time. This command allows you to pull the data using an iterator, which you would have obtained from a previous command, and begin processing it according to the needs of your application.

Explanation:

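  • get-records: Requests data records from a shard, starting at the position given by the shard iterator.
  • --shard-iterator AAAAAAAAAAGhqcqBcV3DemoAX80vXY8xOzSCshRYEuvS8E1udRVTV0V9pmerMC6Dcul/data: The shard iterator obtained earlier, which tells Kinesis where to start reading in the shard. An iterator expires five minutes after it is issued, so request one shortly before reading. The NextShardIterator value in the response is the iterator to pass to the next get-records call.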

Example Output:

{
    "Records": [
        {
            "SequenceNumber": "49640610668614886128711708916060667677544601327287074818",
            "Data": "dGVzdCBtZXNzYWdlIGJhc2U2NA==",
            "PartitionKey": "key1"
        }
    ],
    "NextShardIterator": "AAAABhvT8WROXpknxG2HV8lpDjn1XMZrpQ3nyTMAURFpTfPK5UxoxofO54AzXhdR7I6ehggddgj"
}
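The Data field in the output is base64-encoded, so it has to be decoded before it is readable. The following sketch prints the decoded payload of each returned record; print_record_payloads is a hypothetical helper name, and it assumes a configured AWS CLI, a base64 utility that accepts --decode, and payloads that contain no tabs or newlines.

```shell
# Hypothetical helper: print the decoded payload of each record.
# $1: shard iterator. Assumes a configured AWS CLI.
print_record_payloads() {
    # Select only the Data fields; text output separates them with tabs.
    aws kinesis get-records --shard-iterator "$1" \
        --query 'Records[].Data' --output text |
    tr '\t' '\n' |
    while IFS= read -r data; do
        # Skip blank lines, e.g. when no records were returned.
        if [ -n "$data" ]; then
            printf '%s' "$data" | base64 --decode
            printf '\n'
        fi
    done
}
```

For the example output above, this would print the single line "test message base64".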

Conclusion:

Using the AWS CLI to interact with Amazon Kinesis Data Streams provides a flexible way to manage real-time data processing workflows. With the commands above, developers can list streams, write and read data records, and inspect shards from the command line or from scripts.
