The Applied AI and Natural Language Processing Workshop

Data Replication

Amazon replicates data across multiple servers located in Amazon's data centers within a region. The benefits of data replication include high availability and durability. More specifically, when you create a new object in S3, the data is saved in S3; however, the change must then be replicated across those servers. Replication may take some time, so you might notice short delays before a change is visible everywhere.

After you delete an object, replication lag can cause the deleted data to remain visible until the deletion is fully replicated. Likewise, an object that you have just created might not appear in the object list immediately.
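
Client code that has to cope with such a lag often polls until the expected state becomes visible. The following is only a minimal sketch of that idea, not an AWS API: wait_until_visible, its list_keys argument, and the timing parameters are hypothetical names chosen for illustration, and the bucket listing is simulated so that the example runs without an AWS account.

```python
import time
from typing import Callable, Iterable


def wait_until_visible(
    list_keys: Callable[[], Iterable[str]],
    key: str,
    attempts: int = 5,
    delay_seconds: float = 0.5,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Poll list_keys() until key appears, or give up after `attempts` tries.

    list_keys is a hypothetical stand-in for whatever call lists a bucket.
    The delay doubles after every miss (simple exponential backoff).
    """
    for attempt in range(attempts):
        if key in set(list_keys()):
            return True
        if attempt < attempts - 1:
            sleep(delay_seconds)
            delay_seconds *= 2
    return False


# Simulated listing: the new key only shows up on the third call.
_responses = iter([[], [], ["notes.txt"]])
found = wait_until_visible(
    lambda: next(_responses), "notes.txt", sleep=lambda seconds: None
)
print(found)
```

The same loop could wait for a deleted key to disappear by inverting the membership test.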

The REST Interface

S3's native interface is a Representational State Transfer (REST) API. We recommend that you always use HTTPS for S3 requests. The two higher-level interfaces that we will use to interact with S3 are the AWS Management Console and the AWS CLI. The API itself is simple and supports the following operations on each type of entity:

  • Bucket: Create, delete, or list keys in a bucket
  • Object: Write, read, or delete
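
To make the operations listed above more concrete, the following sketch maps each one to the HTTP verb that the S3 REST API uses for it and builds a virtual-hosted-style HTTPS URL. It is an illustration under stated assumptions: the bucket name, region, and key are placeholders, describe_request and s3_url are hypothetical helper names, and the sketch leaves out the request signing that real S3 calls require.

```python
from urllib.parse import quote

# HTTP verb used by the S3 REST API for each operation in the list above.
BUCKET_OPERATIONS = {"create": "PUT", "delete": "DELETE", "list": "GET"}
OBJECT_OPERATIONS = {"write": "PUT", "read": "GET", "delete": "DELETE"}


def s3_url(bucket: str, region: str, key: str = "") -> str:
    """Build a virtual-hosted-style HTTPS URL for a bucket or an object."""
    # quote() keeps "/" unescaped, so keys that look like paths stay readable.
    return f"https://{bucket}.s3.{region}.amazonaws.com/{quote(key)}"


def describe_request(
    entity: str, operation: str, bucket: str, region: str, key: str = ""
) -> str:
    """Return a 'VERB URL' description of the request (nothing is sent)."""
    if entity == "bucket":
        return f"{BUCKET_OPERATIONS[operation]} {s3_url(bucket, region)}"
    if entity == "object":
        if not key:
            raise ValueError("object operations need a key")
        return f"{OBJECT_OPERATIONS[operation]} {s3_url(bucket, region, key)}"
    raise ValueError(f"unknown entity: {entity!r}")


# Placeholder names, for illustration only.
print(describe_request("bucket", "create", "example-bucket", "us-east-1"))
print(describe_request("object", "read", "example-bucket", "us-east-1",
                       "docs/my file.txt"))
```

In practice, the console, the AWS CLI, and the SDKs issue and sign these requests for you, which is why this chapter works with the higher-level interfaces.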

Exercise 1.01: Using the AWS Management Console to Create an S3 Bucket

In this exercise, we will prepare a place on AWS to store data for ML. To import a file, you need to have access to the Amazon S3 console:

  1. You should have already completed the account setup detailed earlier in this chapter. Go to https://aws.amazon.com/ and click My Account and then AWS Management Console to open the AWS Management Console in a new browser tab:

    Figure 1.6: Accessing the AWS Management Console via the user's account

  2. Click inside the search bar located under AWS services, as shown here:

    Figure 1.7: Searching AWS services

  3. Type S3 into the search bar and an auto-populated list will appear. Then, click the S3 Scalable Storage in the Cloud option:

    Figure 1.8: Selecting the S3 service

  4. Now we need to create an S3 bucket. In the S3 dashboard, click the Create bucket button. If this is the first time that you are creating a bucket, your screen will look like this:

    Figure 1.9: Creating a bucket

    If you have already created S3 buckets, your dashboard will list all the buckets you have created.

    Bucket name: Enter a unique bucket name. Bucket names must be unique across all of S3. If you encounter a naming issue, please refer to https://docs.aws.amazon.com/AmazonS3/latest/dev/BucketRestrictions.html.

    Region: If a default region is auto-populated, keep the default. If it is not auto-populated, select a region near your current location.

  5. Click the Next button to continue the creation of the bucket:

    Figure 1.10: The Create bucket window

  6. An S3 bucket provides the property options Versioning, Server Access Logging, Tags, Object-Level Logging, and Default Encryption; however, we will not enable them.
  7. Your bucket will be displayed in the bucket list, as shown here:

Figure 1.11: The bucket has been created

In this exercise, we have created a place for our files to be stored in the cloud. In the next exercise, we will learn how to store files in this bucket and how to retrieve them.
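
The bucket creation steps above can also be scripted. The sketch below is a hedged illustration rather than the method used in this book: is_valid_bucket_name checks only a subset of the naming rules (length, allowed characters, no adjacent periods, no IP-address format) and cannot tell whether a name is already taken, and create_bucket expects a client object that exposes a create_bucket method in the style of the boto3 S3 client. The RecordingClient class, the bucket name, and the region are stand-ins so that the example runs without contacting AWS.

```python
import re

# 3 to 63 characters; lowercase letters, digits, dots, and hyphens;
# must start and end with a letter or digit.
_NAME_PATTERN = re.compile(r"[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]")
_IP_PATTERN = re.compile(r"\d{1,3}(\.\d{1,3}){3}")


def is_valid_bucket_name(name: str) -> bool:
    """Check a subset of the S3 bucket naming rules (not uniqueness)."""
    if not _NAME_PATTERN.fullmatch(name):
        return False
    if ".." in name or _IP_PATTERN.fullmatch(name):
        return False
    return True


def create_bucket(client, name: str, region: str) -> None:
    """Validate the name, then ask the client to create the bucket."""
    if not is_valid_bucket_name(name):
        raise ValueError(f"invalid bucket name: {name!r}")
    if region == "us-east-1":
        # us-east-1 is the default and takes no location constraint.
        client.create_bucket(Bucket=name)
    else:
        client.create_bucket(
            Bucket=name,
            CreateBucketConfiguration={"LocationConstraint": region},
        )


class RecordingClient:
    """Stand-in that records calls instead of contacting AWS."""

    def __init__(self):
        self.calls = []

    def create_bucket(self, **kwargs):
        self.calls.append(kwargs)


demo_client = RecordingClient()
create_bucket(demo_client, "my-example-bucket-1234", "eu-west-1")
print(demo_client.calls)
```

Assuming boto3 is installed and credentials are configured, passing boto3.client("s3") instead of the stand-in would send the request to AWS.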

Exercise 1.02: Importing and Exporting the File with Your S3 Bucket

In this exercise, we will show you how to place your data in Amazon S3 and how to retrieve it from there.

Follow these steps to complete this exercise:

Importing a file:

  1. Click the bucket's name to navigate to the bucket:

    Figure 1.12: Navigate to the bucket

  2. You are on the bucket's home page. Select Upload:

    Figure 1.13: Uploading a file into the bucket

  3. To select a file to upload, click Add files:

    Figure 1.14: Adding a new file to the bucket

  4. We will upload the pos_sentiment__leaves_of_grass.txt file from the GitHub repository at https://packt.live/3e9lwfR. The easiest approach is to download the repository to your local disk and then select the file:

    Figure 1.15: Selecting the file to upload to the S3 bucket

  5. After selecting a file to upload, select Next:

    Figure 1.16: Selecting the file to upload to the bucket

  6. Leave the default options selected and click the Next button:

    Figure 1.17: The permissions page while uploading the file

  7. You can set property settings for your object, such as Storage class, Encryption, and Metadata. However, leave the default values as they are and then click the Next button:

    Figure 1.18: Setting the properties

  8. Click the Upload button to upload the files:

    Figure 1.19: Uploading the files

  9. You will be directed to your object on your bucket's home screen:

Figure 1.20: Files uploaded to the bucket

Exporting a file:

  1. Select the checkbox next to the file to export (Red Marker #1 – see the following screenshot). This populates the file's information display screen. Click Download (Red Marker #2 – see the following screenshot) to retrieve the text file:

Figure 1.21: Exporting the file

The file will download, as shown in the bottom left-hand corner of the screen:

Figure 1.22: Downloading the file to export

In this exercise, you learned how to import a file to and export a file from your Amazon S3 bucket. As you can see, the process is quite easy thanks to the simple user interface.
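
The same import and export can be sketched in code. This is an illustration under stated assumptions, not the procedure used in this exercise: import_file and export_file are hypothetical helper names, and they expect a client with upload_file and download_file methods in the style of the boto3 S3 client. The InMemoryClient class, the bucket name, and the file contents are stand-ins so that the round trip runs locally without an AWS account.

```python
import os
import tempfile


def import_file(client, local_path: str, bucket: str, key: str = "") -> str:
    """Upload a local file; the key defaults to the file's base name."""
    key = key or os.path.basename(local_path)
    client.upload_file(local_path, bucket, key)
    return key


def export_file(client, bucket: str, key: str, local_path: str) -> str:
    """Download an object to local_path and return that path."""
    client.download_file(bucket, key, local_path)
    return local_path


class InMemoryClient:
    """Stand-in that keeps objects in a dict instead of contacting AWS."""

    def __init__(self):
        self.objects = {}

    def upload_file(self, Filename, Bucket, Key):
        with open(Filename, "rb") as handle:
            self.objects[(Bucket, Key)] = handle.read()

    def download_file(self, Bucket, Key, Filename):
        with open(Filename, "wb") as handle:
            handle.write(self.objects[(Bucket, Key)])


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder content; the real file comes from the book's repository.
    source = os.path.join(tmp, "pos_sentiment__leaves_of_grass.txt")
    with open(source, "w", encoding="utf-8") as handle:
        handle.write("placeholder text")

    client = InMemoryClient()
    bucket = "my-example-bucket-1234"  # hypothetical bucket name
    uploaded_key = import_file(client, source, bucket)
    copy_path = export_file(client, bucket, uploaded_key,
                            os.path.join(tmp, "copy.txt"))
    with open(copy_path, encoding="utf-8") as handle:
        round_trip = handle.read()

print(uploaded_key, round_trip)
```

Assuming boto3 is installed and credentials are configured, passing boto3.client("s3") in place of the stand-in would perform the transfer against a real bucket.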