
doc: file source

Signed-off-by: Jiyong Huang <huangjy@emqx.io>
Jiyong Huang 2 years ago
parent
commit
9cdad5c65d
2 changed files with 92 additions and 35 deletions
  1. 73 2
      docs/en_US/guide/sources/builtin/file.md
  2. 19 33
      docs/zh_CN/guide/sources/builtin/file.md

+ 73 - 2
docs/en_US/guide/sources/builtin/file.md

@@ -4,7 +4,7 @@
 <span style="background:green;color:white">scan table source</span>
 
 eKuiper provides built-in support for reading file content into the eKuiper processing pipeline. The file source is
-usually used as a [table](../../../sqls/tables.md) and it is the default type for create table statement. File sources
+usually used as a [table](../../../sqls/tables.md), and it is the default type for the create table statement. File sources
 are also supported as streams, where it is usually necessary to set the `interval` parameter to pull updates at regular
 intervals.
 
@@ -48,4 +48,75 @@ default:
   ignoreEndLines: 0
 ```
 
-With this yaml file, the table will refer to the file *${eKuiper}/data/lookup.json* and read it in json format.
+### File Types
+
+The file source supports monitoring files or folders. If the monitored location is a folder, all files in the folder are required to be of the same type. When monitoring a folder, it reads the files in alphabetical order by file name.
+
+The supported file types are
+
+- json: standard JSON array format files,
+  see [example](https://github.com/lf-edge/ekuiper/tree/master/internal/topo/source/test/test.json). If the file content is line-separated JSON strings, it needs to be defined in `lines` format.
+- csv: comma-separated csv files are supported, as well as files with custom separators.
+- lines: line-separated file. The decoding method of each line can be defined by the `format` parameter in the stream definition. For example, for line-separated JSON strings, set the file type to `lines` and the format to `json`.
+
+Some files may have most of their data in standard format but contain some metadata in the opening and closing lines. The user can use the `ignoreStartLines` and `ignoreEndLines` parameters to remove these non-standard beginning and ending lines so that the above file types can be parsed.
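+
+For instance, assuming a hypothetical file whose first two lines and last line contain metadata rather than data, a configuration such as the following could skip them:
+
+```yaml
+csv_with_meta:
+  fileType: csv
+  hasHeader: true
+  # skip two metadata lines at the start of the file
+  ignoreStartLines: 2
+  # skip one summary line at the end of the file
+  ignoreEndLines: 1
+```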
+
+### Example
+
+Parsing a file source involves both the file type and the format definition of the data stream. The following examples
+describe how to combine file types and formats to parse file sources.
+
+#### Read a csv with a custom separator
+
+The standard csv separator is a comma, but many files use a csv-like format with custom separators. In addition, some
+csv-like files define column names in the first line instead of data, as in the following example.
+
+```csv
+id name age
+1 John 56
+2 Jane 34
+```
+
+To read such a file, the configuration is as follows, specifying that the file has a header.
+
+```yaml
+csv:
+  fileType: csv
+  hasHeader: true
+```
+
+In the stream definition, set the stream data to `DELIMITED` format, specifying the separator with the `DELIMITER`
+parameter.
+
+```SQL
+create stream cscFileDemo () WITH (FORMAT="DELIMITED", DATASOURCE="abc.csv", TYPE="file", DELIMITER=" ", CONF_KEY="csv");
+```
+
+#### Read multi-line JSON data
+
+For a standard JSON file, the entire file should be a single JSON object or an array. In practice, we often need to
+parse files that contain multiple JSON objects. Such files are not valid JSON as a whole, but each line is valid JSON,
+so they can be treated as multi-line JSON data.
+
+```text
+{"id": 1, "name": "John Doe"}
+{"id": 2, "name": "Jane Doe"}
+{"id": 3, "name": "John Smith"}
+```
+
+When reading this file, the configuration file is as follows, specifying the file type as lines.
+
+```yaml
+jsonlines:
+  fileType: lines
+```
+
+In the stream definition, set the stream data to be in `JSON` format.
+
+```SQL
+create stream linesFileDemo () WITH (FORMAT="JSON", TYPE="file", CONF_KEY="jsonlines");
+```
+
+Moreover, the lines file type can be combined with any format. For example, if you set the format to protobuf and
+configure the schema, it can be used to parse data that contains multiple Protobuf-encoded lines.
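+
+As a sketch of that combination (the conf key and schema name below are hypothetical, assuming a registered protobuf schema referenced via `SCHEMAID`), the configuration would still only declare the file type:
+
+```yaml
+protolines:
+  fileType: lines
+```
+
+while the stream definition would set the format to protobuf and reference the schema:
+
+```SQL
+create stream protoFileDemo () WITH (FORMAT="PROTOBUF", SCHEMAID="demo.Person", TYPE="file", CONF_KEY="protolines");
+```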

+ 19 - 33
docs/zh_CN/guide/sources/builtin/file.md

@@ -46,34 +46,25 @@ default:
   ignoreEndLines: 0
 ```
 
-### File Types
+### File Types
 
-The file source supports monitoring files or folders. If the monitored location is a folder, all files in the folder are
-required to be of the same type. When monitoring a folder, it will read in files order by file name alphabetically.
+The file source supports monitoring files or folders. If the monitored location is a folder, all files in the folder must be of the same type. When monitoring a folder, it reads the files in alphabetical order by file name.
 
-The supported file types are
+The supported file types are:
 
-- json: standard JSON array format files,
-  see [example](https://github.com/lf-edge/ekuiper/tree/master/internal/topo/source/test/test.json). If the file format
-  is a line-separated JSON string, it needs to be defined in lines format.
-- csv: comma-separated csv files are supported, as well as custom separators.
-- lines: line-separated file. The decoding method of each line can be defined by the format parameter in the stream
-  definition. For example, for a line-separated JSON string, the file type is set to lines and the format is set to
-  json.
+- json: standard JSON array format files. See the [example](https://github.com/lf-edge/ekuiper/tree/master/internal/topo/source/test/test.json). If the file content is line-separated JSON strings, it needs to be defined in the `lines` format.
+- csv: comma-separated csv files are supported, as well as files with custom separators.
+- lines: line-separated file. The decoding method of each line can be defined by the `format` parameter in the stream definition. For example, for a file of line-separated JSON strings, set the file type to `lines` and the format to `json`, indicating that each line is in json format.
 
-Some files may have most of the data in standard format, but have some metadata in the opening and closing lines of the
-file. The user can use the `ignoreStartLines` and `ignoreEndLines` arguments to remove the non-standard parts of the
-beginning and end so that the above file types can be parsed.
+Some files may have most of their data in standard format but contain some metadata in the opening and closing lines. The user can use the `ignoreStartLines` and `ignoreEndLines` parameters to remove these non-standard beginning and ending lines so that the above file types can be parsed.
 
-### Example
+### Example
 
-File sources involve the parsing of file contents and intersect with format-related definitions in data streams. We
-describe with some examples how to combine file types and formats for parsing file sources.
+Parsing a file source involves both the file type and the format definition of the data stream. The following examples describe how to combine file types and format settings to parse file sources.
 
-#### Read a csv with a custom separator
+#### Read a csv with a custom separator
 
-The standard csv separator is a comma, but there are a large number of files that use the csv-like format with custom
-separators. Some csv-like files have column names defined in the first line instead of data.
+In a standard csv file the separator is a comma, but many files use a csv-like format with custom separators. In addition, some csv-like files define column names in the first line instead of data, as in the following example.
 
 ```csv
 id name age
@@ -81,7 +72,7 @@ id name age
 2 Jane 34
 ```
 
-When the file is read, the configuration file is as follows, specifying that the file has a header.
+The first line of this file is the header, which defines the column names. To read such a file, the configuration is as follows, specifying that the file has a header.
 
 ```yaml
 csv:
@@ -89,19 +80,16 @@ csv:
   hasHeader: true
 ```
 
-In the stream definition, set the stream data to ``DELIMITED`` format, specifying the separator with the ``DELIMITER``
-parameter.
+In the stream definition, set the stream data to `DELIMITED` format, and use the `DELIMITER` parameter to specify the separator as a space.
 
 ```SQL
 create stream cscFileDemo () WITH (FORMAT="DELIMITED", DATASOURCE="abc.csv", TYPE="file", DELIMITER=" ", CONF_KEY="csv");
 ```
 
-#### Read multi-line JSON data
+#### Read multi-line JSON data
 
-With a standard JSON file, the entire file should be a JSON object or an array. In practice, we often need to parse
-files that contain multiple JSON objects. These files are not actually JSON themselves, but are considered to be
-multiple lines of JSON data, assuming that each JSON object is a single line.
+For a standard JSON file, the entire file should be a single JSON object or an array. In practice, we often need to parse files that contain multiple JSON objects. Such files are not valid JSON as a whole, but each line is valid JSON, so they can be treated as multi-line JSON data.
 
 ```text
 {"id": 1, "name": "John Doe"}
@@ -109,19 +97,17 @@ multiple lines of JSON data, assuming that each JSON object is a single line.
 {"id": 3, "name": "John Smith"}
 ```
 
-When reading this file, the configuration file is as follows, specifying the file type as lines.
+When reading a file in this format, set the file type in the configuration to `lines`.
 
 ```yaml
 jsonlines:
   fileType: lines
 ```
 
-In the stream definition, set the stream data to be in ``JSON`` format.
+In the stream definition, set the stream data to `JSON` format.
 
 ```SQL
-create
-stream linesFileDemo () WITH (FORMAT="JSON", TYPE="file", CONF_KEY="jsonlines"
+create stream linesFileDemo () WITH (FORMAT="JSON", TYPE="file", CONF_KEY="jsonlines");
 ```
 
-Moreover, the lines file type can be combined with any format. For example, if you set the format to protobuf and
-configure the schema, it can be used to parse data that contains multiple Protobuf encoded lines.
+Moreover, the lines file type can be combined with any format. For example, if you set the format to protobuf and configure the schema, it can be used to parse data that contains multiple Protobuf-encoded lines.