Get CSV object in S3 from Lambda(node.js) with SelectObjectContentCommand – AWS SDK for JavaScript v3

S3 Select can be used from Lambda (Node.js). With S3 Select, the CSV data is filtered by an SQL expression at retrieval time, so only the requested rows and columns are returned.

Introducing support for Amazon S3 Select in the AWS SDK for JavaScript | Amazon Web Services

The CSV placed in S3 should have no header row, so that the data starts on the first line, as shown below.

Item: Value
Character encoding: UTF-8
Newline character: CRLF
CSV: a CSV with 2 rows and 3 columns
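For local testing, a file matching these properties can be generated with a few lines of Node.js. This is only a sketch: the first two columns match the values logged by the handler later in this post, while the third-column values (`ccc`, `fff`) and the helper name `toCsv` are placeholders assumed for illustration.

```javascript
// Build a header-less CSV: 2 rows x 3 columns, CRLF line endings, UTF-8.
// 'ccc' and 'fff' are placeholder values assumed for illustration.
const rows = [
  ['aaa', 'bbb', 'ccc'],
  ['ddd', 'eee', 'fff'],
]

function toCsv(records, recordDelimiter = '\r\n') {
  // No quoting or escaping is done here, so fields must not
  // contain commas, quotes, or newlines.
  return records.map((fields) => fields.join(',') + recordDelimiter).join('')
}

const csvText = toCsv(rows)
const csvBytes = Buffer.from(csvText, 'utf8') // bytes to upload as tmp/testdata.csv
```

Each record, including the last one, ends with CRLF, which matches the `RecordDelimiter: '\r\n'` setting used in the handler.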

Install the npm modules.

npm i @aws-sdk/client-s3 csv-parse

Lambda(node.js)

Query the CSV object in S3 with S3 Select, then parse the result into an array with the csv-parse module.

import {S3Client, SelectObjectContentCommand} from '@aws-sdk/client-s3'
import {parse} from 'csv-parse/sync'
export async function handler(event, context) {
  const input = {
    Bucket: 'bucket name',
    Key: 'tmp/testdata.csv',
    ExpressionType: 'SQL',
    Expression: 'SELECT s._1,s._2 FROM S3Object s', // SQL to get only columns 1 and 2
    InputSerialization: {
      CSV: {
        FileHeaderInfo: 'NONE', // no header row; columns are referenced as _1, _2, ...
        RecordDelimiter: '\r\n',
        FieldDelimiter: ',' // comma-separated
      }
    },
    OutputSerialization: {
      CSV: {}
    }
  }
  const client = new S3Client({
    region: 'ap-northeast-1'
  })
  const command = new SelectObjectContentCommand(input)
  const data = await client.send(command)
  const csv = parse(await streamToCsv(data.Payload))
  console.log(csv) // [ [ 'aaa', 'bbb' ], [ 'ddd', 'eee' ] ]
  
  const response = {
    statusCode: 200,
    body: JSON.stringify('Hello from Lambda!'),
  }
  return response
}

async function streamToCsv(generator) {
  const chunks = []
  for await (const value of generator) {
    if (value.Records) {
      chunks.push(value.Records.Payload)
    }
  }
  return Buffer.concat(chunks).toString('utf8')
}
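
Because the file has no header row, the expression refers to columns by position (`_1`, `_2`, and so on). If the set of columns varies, the expression can be assembled from a list of column numbers. The helper below is a hypothetical convenience (the name `buildSelectExpression` is not part of the SDK), sketched under the assumption that every input is a positive integer.

```javascript
// Hypothetical helper: builds a positional S3 Select expression such as
// 'SELECT s._1,s._2 FROM S3Object s' from 1-based column numbers.
function buildSelectExpression(columns) {
  if (!Array.isArray(columns) || columns.length === 0) {
    throw new Error('at least one column number is required')
  }
  for (const n of columns) {
    // Reject anything that is not a positive integer to avoid building invalid SQL.
    if (!Number.isInteger(n) || n < 1) {
      throw new Error(`invalid column number: ${n}`)
    }
  }
  return `SELECT ${columns.map((n) => `s._${n}`).join(',')} FROM S3Object s`
}

const expression = buildSelectExpression([1, 2])
```

The resulting string can be passed as `Expression` in the command input in place of the hard-coded SQL.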

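The Payload comes back as a SelectObjectContentEventStream rather than a plain string or buffer, so retrieving the CSV takes an extra step: streamToCsv iterates over the events and concatenates the Records payloads.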
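
The event stream can also carry events other than `Records` (the API defines `Stats`, `Progress`, `Cont`, and `End` as well), which is why `streamToCsv` checks `value.Records` before reading the payload. The sketch below exercises the same collection logic against a mock async generator, so it runs without AWS access. `mockEventStream` and `collectRecords` are names made up for this illustration, and the event shapes and byte counts are simplified assumptions.

```javascript
// Mock of the async-iterable Payload, with simplified event shapes (assumption).
async function* mockEventStream() {
  yield { Records: { Payload: Buffer.from('aaa,bbb\n', 'utf8') } }
  yield { Records: { Payload: Buffer.from('ddd,eee\n', 'utf8') } }
  yield { Stats: { Details: { BytesScanned: 26, BytesProcessed: 26, BytesReturned: 16 } } }
  yield { End: {} }
}

// Same idea as streamToCsv: keep only Records events and join their payloads.
async function collectRecords(eventStream) {
  const chunks = []
  for await (const event of eventStream) {
    if (event.Records && event.Records.Payload) {
      chunks.push(event.Records.Payload)
    }
  }
  // Decode once at the end so a multi-byte character split
  // across two chunks is not corrupted.
  return Buffer.concat(chunks).toString('utf8')
}

collectRecords(mockEventStream()).then((text) => {
  console.log(JSON.stringify(text)) // "aaa,bbb\nddd,eee\n"
})
```

Concatenating the raw bytes first and decoding afterwards matters for non-ASCII data, because a chunk boundary may fall in the middle of a UTF-8 character.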

InvalidTextEncoding: UTF-8 encoding is required.

It seems that only UTF-8 files are supported; a Shift_JIS (SJIS) file produces the error “InvalidTextEncoding: UTF-8 encoding is required.”
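
One way to catch this earlier is to check the bytes before uploading the file. The sketch below uses the standard `TextDecoder` with `fatal: true`, which throws on byte sequences that are not valid UTF-8. `isValidUtf8` is a hypothetical helper name, and the check assumes a Node.js build with ICU support (the default for official builds).

```javascript
// Hypothetical helper: returns true when the bytes decode cleanly as UTF-8.
function isValidUtf8(bytes) {
  try {
    new TextDecoder('utf-8', { fatal: true }).decode(bytes)
    return true
  } catch {
    return false
  }
}

const utf8Bytes = Buffer.from('aaa,bbb\r\n', 'utf8')
// 0x82 0xA0 is the Shift_JIS encoding of "あ"; it is not a valid UTF-8 sequence.
const sjisBytes = Buffer.from([0x82, 0xa0, 0x2c, 0x62, 0x0d, 0x0a])

console.log(isValidUtf8(utf8Bytes)) // true
console.log(isValidUtf8(sjisBytes)) // false
```

Passing this check does not prove the file was authored as UTF-8 (pure ASCII is valid in both encodings), but a file that fails it would likely trigger the error above.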

