
doc(table): documentation for table and file source

ngjaying 4 years ago
parent
commit
1b109009d2
4 changed files with 132 additions and 24 deletions
  1. 26 0
      docs/en_US/rules/sources/file.md
  2. 26 24
      docs/en_US/sqls/tables.md
  3. 26 0
      docs/zh_CN/rules/sources/file.md
  4. 54 0
      docs/zh_CN/sqls/tables.md

+ 26 - 0
docs/en_US/rules/sources/file.md

@@ -0,0 +1,26 @@
+## File source
+
+Kuiper provides built-in support for reading file content into the Kuiper processing pipeline. The file source is usually used as a [table](../../sqls/tables.md), and it is the default type for the `CREATE TABLE` statement.
+
+```sql
+CREATE TABLE table1 (
+    name STRING,
+    size BIGINT,
+    id BIGINT
+) WITH (DATASOURCE="lookup.json", FORMAT="json", TYPE="file");
+```
+
+
+The configuration file for the file source is */etc/sources/file.yaml*, in which the path to the file can be specified.
+
+```yaml
+default:
+  fileType: json
+  # The directory of the file relative to kuiper root or an absolute path.
+  # Do not include the file name here. The file name should be defined in the stream data source
+  path: data
+  # The interval between reads of the file, in ms. Set it to 0 to read the file only once.
+  interval: 0
+```
+
+With this yaml file, the table will refer to the file *${kuiper}/data/lookup.json* and read it in json format.

+ 26 - 24
docs/en_US/sqls/tables.md

@@ -1,6 +1,8 @@
 # Table specs
 
-Kuiper streams are infinite.  **Table** is provided to read data from a finite source like a file or a normal database table as a batch. The batch source is supposed to be small because it will reside in the memory. The typical scenario to use table is to treat it as a static lookup dictionary to join with the stream.
+Kuiper streams are unbounded and immutable; any new data is appended to the current stream for processing. A **table** represents the current state of a stream and can be considered a snapshot of it. Users can use a table to retain a batch of data for processing.
+
+A table cannot be used alone in Kuiper; it is only recommended to join it with streams. When joined with a stream, the table is updated continuously as new events arrive. However, only events arriving on the stream side trigger downstream updates and produce join output.
 
 
 ## Syntax
 
@@ -13,40 +15,40 @@ CREATE TABLE
     WITH ( property_name = expression [, ...] );
 ```
 
 
-Table supports the same [data types](./streams.md#data-types) as stream. Compared to stream, it has the following limitations:
+Tables support the same [data types](./streams.md#data-types) as streams.
+
+Tables also support all [the properties of the stream](./streams.md#language-definitions), so every source type is also supported in a table. Many sources are not batched and deliver one event at any given point in time, which means the table will always hold only one event. An additional property, `RETAIN_SIZE`, specifies the size of the table snapshot so that the table can hold an arbitrary amount of history data.
 
 
-1. Currently, the only and default supported type is "file", and the source plugin is not supported.
-2. Format "binary" is not supported.
+## Usage scenarios
 
 
-## File type
+Typically, a table is joined with a stream, with or without a window. When joined with a stream, table data does not trigger downstream updates; it is treated as static reference data, although it may be updated internally.
 
 
-Currently, the only supported type for table is file. To create a table that will read lookup.json file is like:
+### Lookup table
+
+A typical use of a table is as a lookup table. A sample SQL looks like this:
 
 
 ```sql
 CREATE TABLE table1 (
-    name STRING,
-    size BIGINT,
-    id BIGINT
-) WITH (DATASOURCE="lookup.json", FORMAT="json");
-```
-The configure file for the file source is in */etc/sources/file.yaml* in which the path to the file can be specified.
-
-```yaml
-default:
-  fileType: json
-  # The directory of the file relative to kuiper root or an absolute path.
-  # Do not include the file name here. The file name should be defined in the stream data source
-  path: data
+		id BIGINT,
+		name STRING
+	) WITH (DATASOURCE="lookup.json", FORMAT="JSON", TYPE="file");
+
+SELECT * FROM demo INNER JOIN table1 on demo.id = table1.id
 ```
 
 
-With this yaml file, the table will refer to the file *${kuiper}/data/lookup.json* and read it in json format.
+In this example, a table `table1` is created to read JSON data from the file "lookup.json". Then, in the rule, `table1` is joined with the stream `demo` so that the stream can look up the name from the id.
 
 
-## Lookup table
+### Filter by history state
 
 
-A typical usage for table is as a lookup table. Sample SQL will be like:
+In some scenarios, we may have one event stream carrying data and another event stream carrying control information.
 
 
 ```sql
-SELECT * FROM demo INNER JOIN table1 on demo.ts = table1.id
+CREATE TABLE stateTable (
+		id BIGINT,
+		triggered bool
+	) WITH (DATASOURCE="myTopic", FORMAT="JSON", TYPE="mqtt");
+
+SELECT * FROM demo LEFT JOIN stateTable WHERE triggered=true
 ```
 
 
-Only when joining with a table, the join statement can be run without a window.
+In this example, a table `stateTable` is created to record the trigger state from the MQTT topic *myTopic*. In the rule, the data of the `demo` stream is filtered by the current trigger state.
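
Note on the tables.md change above: the `RETAIN_SIZE` property is described in prose but has no example in this commit. A minimal sketch, reusing the `stateTable` definition from the diff, might look like the following. The value `10` and its quoted form are assumptions for illustration and are not taken from the commit. The `--` lines are reader annotations and may need to be removed before the statement is submitted.

```sql
-- Sketch only: retain the most recent 10 events from the topic in the table snapshot.
-- RETAIN_SIZE="10" is an assumed value and quoting style.
CREATE TABLE stateTable (
    id BIGINT,
    triggered bool
) WITH (DATASOURCE="myTopic", FORMAT="JSON", TYPE="mqtt", RETAIN_SIZE="10");
```

Without such a property, the text above implies that a non-batched source like MQTT would leave the table holding only its latest event.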

+ 26 - 0
docs/zh_CN/rules/sources/file.md

@@ -0,0 +1,26 @@
+## File source
+
+Kuiper provides built-in support for reading file content into the Kuiper processing pipeline. The file source is usually used as a [table](../../sqls/tables.md), and it is the default type for the `CREATE TABLE` statement.
+
+```sql
+CREATE TABLE table1 (
+    name STRING,
+    size BIGINT,
+    id BIGINT
+) WITH (DATASOURCE="lookup.json", FORMAT="json", TYPE="file");
+```
+
+
+The configuration file for the file source is */etc/sources/file.yaml*, in which the path to the file can be specified.
+
+```yaml
+default:
+  fileType: json
+  # The directory of the file relative to kuiper root or an absolute path.
+  # Do not include the file name here. The file name should be defined in the stream data source
+  path: data
+  # The interval between reads of the file, in ms. Set it to 0 to read the file only once.
+  interval: 0
+```
+
+With this yaml file, the table will refer to the file *${kuiper}/data/lookup.json* and read it in json format.

+ 54 - 0
docs/zh_CN/sqls/tables.md

@@ -0,0 +1,54 @@
+# Table specs
+
+Kuiper streams are unbounded and immutable; any new data is appended to the current stream for processing. A **table** represents the current state of a stream and can be considered a snapshot of it. Users can use a table to retain a batch of data for processing.
+
+A table cannot be used alone in Kuiper; it is only recommended to join it with streams. When joined with a stream, the table is updated continuously as new events arrive. However, only events arriving on the stream side trigger downstream updates and produce join output.
+
+## Syntax
+
+Tables support almost the same syntax as streams. To create a table, run the SQL below:
+
+```sql
+CREATE TABLE   
+    table_name   
+    ( column_name <data_type> [ ,...n ] )
+    WITH ( property_name = expression [, ...] );
+```
+
+Tables support the same [data types](./streams.md#data-types) as streams.
+
+Tables also support all [the properties of the stream](./streams.md#language-definitions), so every source type is also supported in a table. Many sources are not batched and deliver one event at any given point in time, which means the table will always hold only one event. An additional property, `RETAIN_SIZE`, specifies the size of the table snapshot so that the table can hold an arbitrary amount of history data.
+
+## Usage scenarios
+
+Typically, a table is joined with a stream, with or without a window. When joined with a stream, table data does not trigger downstream updates; it is treated as static reference data, although it may be updated internally.
+
+### Lookup table
+
+A typical use of a table is as a lookup table. A sample SQL looks like this:
+
+```sql
+CREATE TABLE table1 (
+		id BIGINT,
+		name STRING
+	) WITH (DATASOURCE="lookup.json", FORMAT="JSON", TYPE="file");
+
+SELECT * FROM demo INNER JOIN table1 on demo.id = table1.id
+```
+
+In this example, a table `table1` is created to read JSON data from the file "lookup.json". Then, in the rule, `table1` is joined with the stream `demo` so that the stream can look up the name from the id.
+
+### Filter by history state
+
+In some scenarios, we may have one event stream carrying data and another event stream carrying control information.
+
+```sql
+CREATE TABLE stateTable (
+		id BIGINT,
+		triggered bool
+	) WITH (DATASOURCE="myTopic", FORMAT="JSON", TYPE="mqtt");
+
+SELECT * FROM demo LEFT JOIN stateTable WHERE triggered=true
+```
+
+In this example, a table `stateTable` is created to record the trigger state from the MQTT topic *myTopic*. In the rule, the data of the `demo` stream is filtered by the current trigger state.
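
Note on the usage scenarios: the text says a table can be joined with a stream with or without a window, yet both examples in this change are windowless. As a hedged sketch, a windowed variant of the lookup join could be written as below. It reuses the `demo` stream and `table1` table from the lookup example; the 10-second tumbling window is an arbitrary choice for illustration and is not part of the commit. The `--` lines are reader annotations and may need to be removed before the statement is submitted.

```sql
-- Sketch only: the lookup join, evaluated per 10-second tumbling window.
-- The window type and length are illustrative assumptions.
SELECT * FROM demo INNER JOIN table1 ON demo.id = table1.id
GROUP BY TUMBLINGWINDOW(ss, 10)
```

In this form the join would be expected to emit results when each window closes rather than once per incoming `demo` event.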