|
| 1 | +======================================================= |
| 2 | +redshift_connector |
| 3 | +======================================================= |
| 4 | + |
| 5 | +redshift_connector is the Amazon Web Services (AWS) Redshift connector for |
| 6 | +Python. Easy integration with `pandas <https://github.com/pandas-dev/pandas>`_ and `numpy <https://github.com/numpy/numpy>`_, together with support for numerous Amazon Redshift-specific features, helps you get the most out of your data. |
| 7 | + |
| 8 | +Supported Amazon Redshift features include: |
| 9 | + |
| 10 | +- IAM authentication |
| 11 | +- Identity provider (IdP) authentication |
| 12 | +- Redshift-specific data types |
| 13 | + |
| 14 | + |
| 15 | +This pure Python connector implements Python Database API Specification 2.0. |
| 16 | + |
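Because the connector follows the Python Database API Specification 2.0, code written against that interface should, in principle, also work with a redshift_connector connection. The sketch below illustrates the idea with a hypothetical helper, ``fetch_all_rows``, which is not part of redshift_connector. It is exercised against the standard library's ``sqlite3`` module purely as a stand-in so that it runs without a cluster. Note that ``sqlite3`` uses ``?`` placeholders, whereas the examples in this README use ``%s``.

```python
import sqlite3
from contextlib import closing


def fetch_all_rows(conn, sql, params=()):
    """Run a query on any DB-API 2.0 connection and return all rows.

    Hypothetical helper for illustration; not part of redshift_connector.
    """
    # closing() calls cursor.close() even if execute() raises
    with closing(conn.cursor()) as cursor:
        cursor.execute(sql, params)
        return cursor.fetchall()


# sqlite3 is only a stand-in here; it also implements DB-API 2.0.
conn = sqlite3.connect(":memory:")
with closing(conn.cursor()) as cur:
    cur.execute("create table book (bookname varchar, author varchar)")
    cur.executemany(
        "insert into book (bookname, author) values (?, ?)",
        [
            ("One Hundred Years of Solitude", "Gabriel García Márquez"),
            ("A Brief History of Time", "Stephen Hawking"),
        ],
    )
conn.commit()

rows = fetch_all_rows(
    conn,
    "select author from book where bookname = ?",
    ("A Brief History of Time",),
)
print(rows)
conn.close()
```

Keeping the cursor handling in one small function means the calling code does not depend on which DB-API 2.0 driver created the connection.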
| 17 | + |
| 18 | +Getting Started |
| 19 | +--------------- |
| 20 | +The easiest way to get started with redshift_connector is to install it with `pip <https://pip.pypa.io/en/stable/>`_: |
| 21 | + |
| 22 | +``pip install redshift_connector`` |
| 23 | + |
| 24 | +Note: redshift_connector requires Python >= 3.5 |
| 25 | + |
| 26 | + |
| 27 | +You can install from source by cloning this repository. Assuming that you have Python and ``virtualenv`` installed, set up your environment and install the required dependencies like this: |
| 28 | + |
| 29 | +.. code-block:: sh |
| 30 | +
|
| 31 | +    $ git clone https://github.com/aws/amazon-redshift-python-driver.git |
| 32 | +    $ cd amazon-redshift-python-driver |
| 33 | +    $ virtualenv venv |
| 34 | +    $ . venv/bin/activate |
| 35 | +    $ python -m pip install -r requirements.txt |
| 36 | +    $ python -m pip install -e . |
| 37 | +    $ python -m pip install redshift_connector |
| 38 | +
|
| 39 | +Basic Example |
| 40 | +~~~~~~~~~~~~~ |
| 41 | +.. code-block:: python |
| 42 | +
|
| 43 | +    import redshift_connector |
| 44 | +
|
| 45 | +    # Connect to the Redshift cluster with a database user name and password |
| 46 | +    conn = redshift_connector.connect( |
| 47 | +        host='examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com', |
| 48 | +        port=5439, |
| 49 | +        database='dev', |
| 50 | +        user='awsuser', |
| 51 | +        password='my_password' |
| 52 | +    ) |
| 53 | +
|
| 54 | +    cursor: redshift_connector.Cursor = conn.cursor() |
| 55 | +    cursor.execute("create temp table book (bookname varchar, author varchar)") |
| 56 | +    cursor.executemany("insert into book (bookname, author) values (%s, %s)", |
| 57 | +        [ |
| 58 | +            ('One Hundred Years of Solitude', 'Gabriel García Márquez'), |
| 59 | +            ('A Brief History of Time', 'Stephen Hawking') |
| 60 | +        ] |
| 61 | +    ) |
| 62 | +    cursor.execute("select * from book") |
| 63 | +
|
| 64 | +    result: tuple = cursor.fetchall() |
| 65 | +    print(result) |
| 66 | +    # >> (['One Hundred Years of Solitude', 'Gabriel García Márquez'], ['A Brief History of Time', 'Stephen Hawking']) |
| 67 | +
|
| 68 | +
|
| 69 | +Integration with pandas |
| 70 | +~~~~~~~~~~~~~~~~~~~~~~~ |
| 71 | +.. code-block:: python |
| 72 | +
|
| 73 | +    import pandas |
| 74 | +    cursor.execute("create temp table book (bookname varchar, author varchar)") |
| 75 | +    cursor.executemany("insert into book (bookname, author) values (%s, %s)", |
| 76 | +        [ |
| 77 | +            ('One Hundred Years of Solitude', 'Gabriel García Márquez'), |
| 78 | +            ('A Brief History of Time', 'Stephen Hawking') |
| 79 | +
|
| 80 | +        ]) |
| 81 | +    cursor.execute("select * from book") |
| 82 | +    result: pandas.DataFrame = cursor.fetch_dataframe() |
| 83 | +    print(result) |
| 84 | +    # >>                         bookname                  author |
| 85 | +    # >> 0  One Hundred Years of Solitude  Gabriel García Márquez |
| 86 | +    # >> 1        A Brief History of Time         Stephen Hawking |
| 87 | +
|
| 88 | +
|
| 89 | +Integration with numpy |
| 90 | +~~~~~~~~~~~~~~~~~~~~~~ |
| 91 | + |
| 92 | +.. code-block:: python |
| 93 | +
|
| 94 | +    import numpy |
| 95 | +    cursor.execute("select * from book") |
| 96 | +
|
| 97 | +    result: numpy.ndarray = cursor.fetch_numpy_array() |
| 98 | +    print(result) |
| 99 | +    # >> [['One Hundred Years of Solitude' 'Gabriel García Márquez'] |
| 100 | +    # >>  ['A Brief History of Time' 'Stephen Hawking']] |
| 101 | +
|
| 102 | +Query using functions |
| 103 | +~~~~~~~~~~~~~~~~~~~~~ |
| 104 | +.. code-block:: python |
| 105 | +
|
| 106 | +    cursor.execute("SELECT CURRENT_TIMESTAMP") |
| 107 | +    print(cursor.fetchone()) |
| 108 | +    # >> [datetime.datetime(2020, 10, 26, 23, 3, 54, 756497, tzinfo=datetime.timezone.utc)] |
| 109 | +
|
| 110 | +
|
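As the output above shows, the timestamp arrives as a timezone-aware ``datetime.datetime`` in UTC. A minimal sketch of post-processing such a value with the standard library follows. The row is written out literally, copied from the output above, so that the snippet runs without a cluster, and the fixed UTC-7 offset is an arbitrary choice for illustration.

```python
import datetime

# Literal copy of the row printed above; in practice it would come from cursor.fetchone()
row = [datetime.datetime(2020, 10, 26, 23, 3, 54, 756497, tzinfo=datetime.timezone.utc)]
timestamp = row[0]

# ISO 8601 string, convenient for logs or JSON payloads
iso_text = timestamp.isoformat()

# The same instant expressed in a fixed UTC-7 offset (illustrative choice)
offset = datetime.timezone(datetime.timedelta(hours=-7))
local = timestamp.astimezone(offset)

print(iso_text)
print(local.isoformat())
```

Because the value is timezone-aware, ``astimezone`` converts it without any manual offset arithmetic.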
| 111 | +Connection Parameters |
| 112 | +~~~~~~~~~~~~~~~~~~~~~ |
| 113 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 114 | +| Name | Description | Default Value | Required | |
| 115 | ++=========================+============================================================================================+===============+==========+ |
| 116 | +| database | String. The name of the database to connect to | | Yes | |
| 117 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 118 | +| user | String. The username to use for authentication | | Yes | |
| 119 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 120 | +| password | String. The password to use for authentication | | Yes | |
| 121 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 122 | +| host | String. The hostname of the Amazon Redshift cluster | | Yes |
| 123 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 124 | +| port | Int. The port number of the Amazon Redshift cluster | 5439 | No | |
| 125 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 126 | +| ssl | Bool. If SSL is enabled | True | No | |
| 127 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 128 | +| iam | Bool. If IAM Authentication is enabled | False | No | |
| 129 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 130 | +| sslmode | String. The security of the connection to Amazon Redshift. | 'verify-ca' | No | |
| 131 | +| | 'verify-ca' and 'verify-full' are supported. | | | |
| 132 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 133 | +| idp_response_timeout | Int. The timeout for retrieving SAML assertion from IdP | 120 | No | |
| 134 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 135 | +| idp_port | Int. The listen port IdP will send the SAML assertion to | 7890 | No | |
| 136 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 137 | +| log_level | Int. The level of logging enabled, increasing in granularity (values [0,4] are valid) | 0 | No | |
| 138 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 139 | +| log_path | String. The file path to the log file | 'driver.log' | No | |
| 140 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 141 | +| max_prepared_statements | Int. The maximum number of prepared statements that can be open at once | 1000 | No | |
| 142 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 143 | +| idp_tenant | String. The IdP tenant | None | No | |
| 144 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 145 | +| credential_provider | String. The IdP that will be used for authenticating with Amazon Redshift. | None | No | |
| 146 | +| | 'OktaCredentialsProvider', 'AzureCredentialsProvider', 'BrowserAzureCredentialsProvider', | | | |
| 147 | +| | 'PingCredentialsProvider', 'BrowserSamlCredentialsProvider', and 'AdfsCredentialsProvider' | | | |
| 148 | +| | are supported. | | | |
| 149 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 150 | +| cluster_identifier | String. The cluster identifier of the Amazon Redshift Cluster | None | No | |
| 151 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 152 | +| db_user | String. The user ID to use with Amazon Redshift | None | No | |
| 153 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 154 | +| login_url | String. The SSO URL for the IdP | None | No |
| 155 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 156 | +| preferred_role | String. The IAM role preferred for the current connection | None | No | |
| 157 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 158 | +| client_secret | String. The client secret from Azure IdP | None | No | |
| 159 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 160 | +| client_id | String. The client id from Azure IdP | None | No | |
| 161 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 162 | +| region | String. The AWS region where the cluster is located | None | No | |
| 163 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 164 | +| app_name | String. The name of the IdP application used for authentication. | None | No | |
| 165 | ++-------------------------+--------------------------------------------------------------------------------------------+---------------+----------+ |
| 166 | + |
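The table above can be turned into a small pre-flight check before calling ``redshift_connector.connect``. The sketch below is not part of the driver: ``build_connect_kwargs`` is a hypothetical helper, and its required names, defaults, and ``sslmode`` values are copied from the table. It only assembles and validates a dictionary, so it runs without a cluster and without the driver installed.

```python
# Hypothetical helper; names and defaults are copied from the table above.
REQUIRED = ("database", "user", "password", "host")
DEFAULTS = {"port": 5439, "ssl": True, "iam": False, "sslmode": "verify-ca"}
SUPPORTED_SSLMODES = ("verify-ca", "verify-full")


def build_connect_kwargs(**overrides):
    """Merge caller-supplied values over the documented defaults and validate them."""
    kwargs = {**DEFAULTS, **overrides}

    # Every parameter marked "Yes" in the Required column must be present and non-empty
    missing = [name for name in REQUIRED if not kwargs.get(name)]
    if missing:
        raise ValueError("missing required parameters: " + ", ".join(missing))

    # Only the two modes listed in the table are accepted
    if kwargs["sslmode"] not in SUPPORTED_SSLMODES:
        raise ValueError("unsupported sslmode: " + repr(kwargs["sslmode"]))

    return kwargs


kwargs = build_connect_kwargs(
    host="examplecluster.abc123xyz789.us-west-1.redshift.amazonaws.com",
    database="dev",
    user="awsuser",
    password="my_password",
    sslmode="verify-full",
)
print(kwargs["port"], kwargs["sslmode"])

# With the driver installed, the dictionary could then be passed along:
# conn = redshift_connector.connect(**kwargs)
```

Validating up front gives a clear error message for a missing or mistyped parameter before any network connection is attempted.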
| 167 | + |
| 168 | +Getting Help |
| 169 | +~~~~~~~~~~~~ |
| 170 | +- Ask a question on `Stack Overflow <https://stackoverflow.com/>`_ and tag it with redshift_connector |
| 171 | +- Open a support ticket with `AWS Support <https://console.aws.amazon.com/support/home#/>`_ |
| 172 | +- If you think you have found a bug, please `open an issue <https://github.com/aws/amazon-redshift-python-driver/issues/new>`_ |
| 173 | + |
| 174 | +Contributing |
| 175 | +~~~~~~~~~~~~ |
| 176 | +We look forward to collaborating with you! Please read through `CONTRIBUTING <https://github.com/aws/amazon-redshift-python-driver/blob/master/CONTRIBUTING.md#Reporting-Bugs/Feature-Requests>`_ before submitting any issues or pull requests. |
| 177 | + |
| 178 | +Running Tests |
| 179 | +------------- |
| 180 | +Run the unit tests with ``pytest test/unit``. Integration tests require credentials for an Amazon Redshift cluster, as well as IdP attributes, in ``test/config.ini``. |
| 181 | + |
| 182 | +Additional Resources |
| 183 | +~~~~~~~~~~~~~~~~~~~~ |
| 184 | +- `LICENSE <https://github.com/aws/amazon-redshift-python-driver/blob/master/LICENSE>`_ |