url
url
function creates a table from the URL
with given format
and structure
.
url
function may be used in SELECT
and INSERT
queries on data in URL tables.
Syntax
url(URL [,format] [,structure] [,headers])
Parameters
URL
— HTTP or HTTPS server address, which can acceptGET
orPOST
requests (forSELECT
orINSERT
queries correspondingly). Type: String.format
— Format of the data. Type: String.structure
— Table structure in'UserID UInt64, Name String'
format. Determines column names and types. Type: String.headers
- Headers in'headers('key1'='value1', 'key2'='value2')'
format. You can set headers for HTTP call.
Returned value
A table with the specified format and structure and with data from the defined URL
.
Examples
Getting the first 3 lines of a table that contains columns of String
and UInt32 type from HTTP-server which answers in CSV format.
SELECT * FROM url('http://127.0.0.1:12345/', CSV, 'column1 String, column2 UInt32', headers('Accept'='text/csv; charset=utf-8')) LIMIT 3;
Inserting data from a URL
into a table:
CREATE TABLE test_table (column1 String, column2 UInt32) ENGINE=Memory;
INSERT INTO FUNCTION url('http://127.0.0.1:8123/?query=INSERT+INTO+test_table+FORMAT+CSV', 'CSV', 'column1 String, column2 UInt32') VALUES ('http interface', 42);
SELECT * FROM test_table;
Globs in URL
Patterns in curly brackets { }
are used to generate a set of shards or to specify failover addresses. Supported pattern types and examples see in the description of the remote function.
Character |
inside patterns is used to specify failover addresses. They are iterated in the same order as listed in the pattern. The number of generated addresses is limited by glob_expansion_max_elements setting.
Virtual Columns
_path
— Path to theURL
. Type:LowCardinalty(String)
._file
— Resource name of theURL
. Type:LowCardinalty(String)
._size
— Size of the resource in bytes. Type:Nullable(UInt64)
. If the size is unknown, the value isNULL
._time
— Last modified time of the file. Type:Nullable(DateTime)
. If the time is unknown, the value isNULL
._headers
- HTTP response headers. Type:Map(LowCardinality(String), LowCardinality(String))
.
Hive-style partitioning
When setting use_hive_partitioning
is set to 1, ClickHouse will detect Hive-style partitioning in the path (/name=value/
) and will allow to use partition columns as virtual columns in the query. These virtual columns will have the same names as in the partitioned path, but starting with _
.
Example
Use virtual column, created with Hive-style partitioning
SET use_hive_partitioning = 1;
SELECT * from url('http://data/path/date=*/country=*/code=*/*.parquet') where _date > '2020-01-01' and _country = 'Netherlands' and _code = 42;
Storage Settings
- engine_url_skip_empty_files - allows to skip empty files while reading. Disabled by default.
- enable_url_encoding - allows to enable/disable decoding/encoding path in uri. Enabled by default.
See Also