Option to disable hive partitioning wild cards #232

@niydt

The Avro files we are trying to load into Redshift are stored in folders with "=" in their names, e.g.

    event_type=users.behaviors.app.FirstSession/

When loading data from the following S3 prefix:

    com.hoopladigital.brazecurrentsstaging/StagingCurrentFull/dataexport.prod-03.S3.integration.60d3692fcab9ca5f83919aab/event_type%3Dusers.behaviors.app.FirstSession

the Lambda failed with this error:

    error: No Configuration Found for com.hoopladigital.brazecurrentsstaging/StagingCurrentFull/dataexport.prod-03.S3.integration.60d3692fcab9ca5f83919aab/event_type=*/date=*/399/prod-03

As the error message shows, the "event_type=*/date=*" portion of the path was transformed on the assumption that we are using Hive partitioning style wildcards (https://github.com/awslabs/aws-lambda-redshift-loader#hive-partitioning-style-wildcards), which replaced the event_type value with *.
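To make the transformation concrete, here is a minimal sketch of a Hive-style wildcard rewrite. It is not the loader's actual code: `toHiveWildcardPrefix` is a hypothetical name, the rule (everything after the first "=" in a path segment becomes "*") is an assumption inferred from the error message, and the sample key, including its date value, is shortened and made up for illustration.

```javascript
// Hypothetical stand-in for the loader's Hive-style prefix transform.
// Assumption: within each '/'-separated segment, the text after the
// first '=' is replaced by '*'; segments without '=' are left alone.
function toHiveWildcardPrefix(key) {
  return key
    .split('/')
    .map((segment) => segment.replace(/=.*$/, '=*'))
    .join('/');
}

// Illustrative key only (shortened; the date value is invented).
const sampleKey =
  'StagingCurrentFull/event_type=users.behaviors.app.FirstSession/date=2021-06-23';

console.log(toHiveWildcardPrefix(sampleKey));
// prints "StagingCurrentFull/event_type=*/date=*"
```

Under this assumption, a configuration stored under the literal folder name can never match, because the lookup key always contains "event_type=*".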

We don't want to use this feature; we need the Lambda to use the exact folder name provided in the prefix. Is there a way to configure the Lambda not to use Hive partitioning wildcards?

Line 1584 of index.js:

    inputInfo.prefix = inputInfo.bucket + '/' + searchKey.transformHiveStylePrefix();

Line 78 of index.js:

    transformHiveStylePrefix()
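One possible shape for the requested opt-out is sketched below. Everything here is hypothetical: `resolvePrefix` and the `hiveWildcards` option are invented names, not part of the loader, and the wildcard rule is the same assumption as above (the value after "=" in each path segment becomes "*"). The default keeps the current behavior so existing configurations would be unaffected.

```javascript
// Hypothetical helper: builds the configuration lookup prefix, with an
// opt-out for the Hive-style wildcard rewrite. Not the loader's real API.
function resolvePrefix(bucket, searchKey, options = {}) {
  // Wildcards stay on unless the caller explicitly passes hiveWildcards: false.
  const useWildcards = options.hiveWildcards !== false;
  const keyPart = useWildcards
    ? searchKey
        .split('/')
        .map((segment) => segment.replace(/=.*$/, '=*')) // assumed rewrite rule
        .join('/')
    : searchKey; // exact folder names, as asked for in this issue
  return bucket + '/' + keyPart;
}

// Illustrative values only.
const bucket = 'my-bucket';
const key = 'exports/event_type=users.behaviors.app.FirstSession';

console.log(resolvePrefix(bucket, key));
// prints "my-bucket/exports/event_type=*"
console.log(resolvePrefix(bucket, key, { hiveWildcards: false }));
// prints "my-bucket/exports/event_type=users.behaviors.app.FirstSession"
```

With something like this, the assignment on line 1584 could consult a per-deployment setting instead of always calling the transform.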
