diff --git a/docs/content.zh/docs/dev/datastream/testing.md b/docs/content.zh/docs/dev/datastream/testing.md
index 5891769881bc3..70c9f9c24e4d9 100644
--- a/docs/content.zh/docs/dev/datastream/testing.md
+++ b/docs/content.zh/docs/dev/datastream/testing.md
@@ -112,7 +112,7 @@ public class StatefulFlatMapTest {
         //instantiate user-defined function
         statefulFlatMapFunction = new StatefulFlatMapFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         testHarness = new OneInputStreamOperatorTestHarness<>(new StreamFlatMap<>(statefulFlatMapFunction));
 
         // optionally configured the execution environment
@@ -158,7 +158,7 @@ public class StatefulFlatMapFunctionTest {
         //instantiate user-defined function
         statefulFlatMapFunction = new StatefulFlatMapFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         testHarness = new KeyedOneInputStreamOperatorTestHarness<>(new StreamFlatMap<>(statefulFlatMapFunction), new MyStringKeySelector(), Types.STRING);
 
         // open the test harness (will also call open() on RichFunctions)
@@ -204,7 +204,7 @@ public class PassThroughProcessFunctionTest {
         //instantiate user-defined function
         PassThroughProcessFunction processFunction = new PassThroughProcessFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         OneInputStreamOperatorTestHarness harness = ProcessFunctionTestHarnesses
                 .forProcessFunction(processFunction);
 
diff --git a/docs/content.zh/docs/ops/debugging/debugging_classloading.md b/docs/content.zh/docs/ops/debugging/debugging_classloading.md
index cc36a2449e104..423d55299fba9 100644
--- a/docs/content.zh/docs/ops/debugging/debugging_classloading.md
+++ b/docs/content.zh/docs/ops/debugging/debugging_classloading.md
@@ -54,7 +54,7 @@ Flink应用程序运行时，JVM会随着时间不断加载各种不同的类。
 If you package a Flink
 job/application such that your application treats Flink like a library (JobManager/TaskManager daemons as spawned as needed), then typically
 all classes are in the *application classpath*. This is the recommended way for container-based setups where the container is specifically
-created for an job/application and will contain the job/application's jar files.
+created for a job/application and will contain the job/application's jar files.
 
 -->
diff --git a/docs/content.zh/docs/ops/events.md b/docs/content.zh/docs/ops/events.md
index d8ba0459f6dab..3b66504da3f10 100644
--- a/docs/content.zh/docs/ops/events.md
+++ b/docs/content.zh/docs/ops/events.md
@@ -28,7 +28,7 @@ under the License.
 
 # Events
 
-Flink exposes a event reporting system that allows gathering and exposing events to external systems.
+Flink exposes an event reporting system that allows gathering and exposing events to external systems.
 
 ## Reporting events
 
diff --git a/docs/content.zh/docs/sql/reference/ddl/create.md b/docs/content.zh/docs/sql/reference/ddl/create.md
index 6a9fcaafe7698..77c7f4de60803 100644
--- a/docs/content.zh/docs/sql/reference/ddl/create.md
+++ b/docs/content.zh/docs/sql/reference/ddl/create.md
@@ -230,7 +230,7 @@ CREATE TABLE MyTable (
 
 Metadata columns are an extension to the SQL standard and allow to access connector and/or format specific
 fields for every row of a table. A metadata column is indicated by the `METADATA` keyword. For example,
-a metadata column can be be used to read and write the timestamp from and to Kafka records for time-based
+a metadata column can be used to read and write the timestamp from and to Kafka records for time-based
 operations. The [connector and format documentation]({{< ref "docs/connectors/table/overview" >}}) lists the
 available metadata fields for every component. However, declaring a metadata column in a table's schema is optional.
diff --git a/docs/content.zh/release-notes/flink-2.2.md b/docs/content.zh/release-notes/flink-2.2.md
index 69695209c9e38..2a822ab7c4f52 100644
--- a/docs/content.zh/release-notes/flink-2.2.md
+++ b/docs/content.zh/release-notes/flink-2.2.md
@@ -116,7 +116,7 @@ BINARY and VARBINARY should now correctly consider the target length.
 
 ##### [FLINK-38209](https://issues.apache.org/jira/browse/FLINK-38209)
 
-This is considerable optimization and an breaking change for the StreamingMultiJoinOperator.
+This is considerable optimization and a breaking change for the StreamingMultiJoinOperator.
 As noted in the release notes, the operator was launched in an experimental state for Flink 2.1
 since we're working on relevant optimizations that could be breaking changes.
diff --git a/docs/content/docs/connectors/table/formats/raw.md b/docs/content/docs/connectors/table/formats/raw.md
index 9596eb07fcd76..339736cbaac7b 100644
--- a/docs/content/docs/connectors/table/formats/raw.md
+++ b/docs/content/docs/connectors/table/formats/raw.md
@@ -58,7 +58,7 @@ CREATE TABLE nginx_log (
 )
 ```
 
-Then you can read out the raw data as a pure string, and split it into multiple fields using an user-defined-function for further analysing, e.g. `my_split` in the example.
+Then you can read out the raw data as a pure string, and split it into multiple fields using a user-defined-function for further analysing, e.g. `my_split` in the example.
 
 ```sql
 SELECT t.hostname, t.datetime, t.url, t.browser, ...
diff --git a/docs/content/docs/dev/datastream/application_parameters.md b/docs/content/docs/dev/datastream/application_parameters.md
index 6583c5c8a7184..d26566b96d793 100644
--- a/docs/content/docs/dev/datastream/application_parameters.md
+++ b/docs/content/docs/dev/datastream/application_parameters.md
@@ -94,7 +94,7 @@ parameters.getNumberOfParameters();
 ```
 
 You can use the return values of these methods directly in the `main()` method of the client submitting the application.
-For example, you could set the parallelism of a operator like this:
+For example, you could set the parallelism of an operator like this:
 
 ```java
 ParameterTool parameters = ParameterTool.fromArgs(args);
diff --git a/docs/content/docs/dev/datastream/testing.md b/docs/content/docs/dev/datastream/testing.md
index 34be49d0820ff..bf040163d141d 100644
--- a/docs/content/docs/dev/datastream/testing.md
+++ b/docs/content/docs/dev/datastream/testing.md
@@ -111,7 +111,7 @@ public class StatefulFlatMapTest {
         //instantiate user-defined function
         statefulFlatMapFunction = new StatefulFlatMapFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         testHarness = new OneInputStreamOperatorTestHarness<>(new StreamFlatMap<>(statefulFlatMapFunction));
 
         // optionally configured the execution environment
@@ -157,7 +157,7 @@ public class StatefulFlatMapFunctionTest {
         //instantiate user-defined function
         statefulFlatMapFunction = new StatefulFlatMapFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         testHarness = new KeyedOneInputStreamOperatorTestHarness<>(new StreamFlatMap<>(statefulFlatMapFunction), new MyStringKeySelector(), Types.STRING);
 
         // open the test harness (will also call open() on RichFunctions)
@@ -203,7 +203,7 @@ public class PassThroughProcessFunctionTest {
         //instantiate user-defined function
         PassThroughProcessFunction processFunction = new PassThroughProcessFunction();
 
-        // wrap user defined function into a the corresponding operator
+        // wrap user defined function into the corresponding operator
         OneInputStreamOperatorTestHarness harness = ProcessFunctionTestHarnesses
                 .forProcessFunction(processFunction);
 
diff --git a/docs/content/docs/dev/table/functions/udfs.md b/docs/content/docs/dev/table/functions/udfs.md
index 9441a28e1581b..7543bd91f134b 100644
--- a/docs/content/docs/dev/table/functions/udfs.md
+++ b/docs/content/docs/dev/table/functions/udfs.md
@@ -1666,7 +1666,7 @@ def retract(accumulator: ACC, [user defined inputs]): Unit
  * be noted that the accumulator may contain the previous aggregated
  * results. Therefore user should not replace or clean this instance in the
  * custom merge method.
- * param: iterable an java.lang.Iterable pointed to a group of accumulators that will be
+ * param: iterable a java.lang.Iterable pointed to a group of accumulators that will be
  *        merged.
  */
 public void merge(ACC accumulator, java.lang.Iterable iterable)
@@ -1682,7 +1682,7 @@ public void merge(ACC accumulator, java.lang.Iterable iterable)
  * be noted that the accumulator may contain the previous aggregated
  * results. Therefore user should not replace or clean this instance in the
  * custom merge method.
- * param: iterable an java.lang.Iterable pointed to a group of accumulators that will be
+ * param: iterable a java.lang.Iterable pointed to a group of accumulators that will be
  *        merged.
  */
 def merge(accumulator: ACC, iterable: java.lang.Iterable[ACC]): Unit
@@ -2031,7 +2031,7 @@ def retract(accumulator: ACC, [user defined inputs]): Unit
  * be noted that the accumulator may contain the previous aggregated
  * results. Therefore user should not replace or clean this instance in the
  * custom merge method.
- * param: iterable an java.lang.Iterable pointed to a group of accumulators that will be
+ * param: iterable a java.lang.Iterable pointed to a group of accumulators that will be
  *        merged.
  */
 public void merge(ACC accumulator, java.lang.Iterable iterable)
@@ -2047,7 +2047,7 @@ public void merge(ACC accumulator, java.lang.Iterable iterable)
  * be noted that the accumulator may contain the previous aggregated
  * results. Therefore user should not replace or clean this instance in the
  * custom merge method.
- * param: iterable an java.lang.Iterable pointed to a group of accumulators that will be
+ * param: iterable a java.lang.Iterable pointed to a group of accumulators that will be
  *        merged.
  */
 def merge(accumulator: ACC, iterable: java.lang.Iterable[ACC]): Unit
diff --git a/docs/content/docs/ops/batch/batch_shuffle.md b/docs/content/docs/ops/batch/batch_shuffle.md
index b28add22c46b1..bb6f99ac4828a 100644
--- a/docs/content/docs/ops/batch/batch_shuffle.md
+++ b/docs/content/docs/ops/batch/batch_shuffle.md
@@ -173,7 +173,7 @@ Here are some exceptions you may encounter (rarely) and the corresponding soluti
 | :--------- | :------------------ |
 | Insufficient number of network buffers | This means the amount of network memory is not enough to run the target job and you need to increase the total network memory size. Note that since 1.15, `Sort Shuffle` has become the default blocking shuffle implementation and for some cases, it may need more network memory than before, which means there is a small possibility that your batch jobs may suffer from this issue after upgrading to 1.15. If this is the case, you just need to increase the total network memory size. |
 | Too many open files | This means that the file descriptors is not enough. If you are using `Hash Shuffle`, please switch to `Sort Shuffle`. If you are already using `Sort Shuffle`, please consider increasing the system limit for file descriptor and check if the user code consumes too many file descriptors. |
-| Connection reset by peer | This usually means that the network is unstable or or under heavy burden. Other issues like SSL handshake timeout mentioned above may also cause this problem. If you are using `Hash Shuffle`, please switch to `Sort Shuffle`. If you are already using `Sort Shuffle`, increasing the [network backlog]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-server-backlog) may help. |
+| Connection reset by peer | This usually means that the network is unstable or under heavy burden. Other issues like SSL handshake timeout mentioned above may also cause this problem. If you are using `Hash Shuffle`, please switch to `Sort Shuffle`. If you are already using `Sort Shuffle`, increasing the [network backlog]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-server-backlog) may help. |
 | Network connection timeout | This usually means that the network is unstable or under heavy burden and increasing the [network connection timeout]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-client-connectTimeoutSec) or enable [connection retry]({{< ref "docs/deployment/config" >}}#taskmanager-network-retries) may help. |
 | Socket read/write timeout | This may indicate that the network is slow or under heavy burden and increasing the [network send/receive buffer size]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-sendReceiveBufferSize) may help. If the job is running in Kubernetes environment, using [host network]({{< ref "docs/deployment/config" >}}#kubernetes-hostnetwork-enabled) may also help. |
 | Read buffer request timeout | This can happen only when you are using `Sort Shuffle` and it means a fierce contention of the shuffle read memory. To solve the issue, you can increase [taskmanager.memory.framework.off-heap.batch-shuffle.size]({{< ref "docs/deployment/config" >}}#taskmanager-memory-framework-off-heap-batch-shuffle-size) together with [taskmanager.memory.framework.off-heap.size]({{< ref "docs/deployment/config" >}}#taskmanager-memory-framework-off-heap-size). |
@@ -188,7 +188,7 @@ Here are some exceptions you may encounter (rarely) and the corresponding soluti
 | Exceptions | Potential Solutions |
 |:--------------------------------------------|:--------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------------|
 | Insufficient number of network buffers | This means the amount of network memory is not enough to run the target job and you need to increase the total network memory size. |
-| Connection reset by peer | This usually means that the network is unstable or or under heavy burden. Other issues like SSL handshake timeout may also cause this problem. Increasing the [network backlog]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-server-backlog) may help. |
+| Connection reset by peer | This usually means that the network is unstable or under heavy burden. Other issues like SSL handshake timeout may also cause this problem. Increasing the [network backlog]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-server-backlog) may help. |
 | Network connection timeout | This usually means that the network is unstable or under heavy burden and increasing the [network connection timeout]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-client-connectTimeoutSec) or enable [connection retry]({{< ref "docs/deployment/config" >}}#taskmanager-network-retries) may help. |
 | Socket read/write timeout | This may indicate that the network is slow or under heavy burden and increasing the [network send/receive buffer size]({{< ref "docs/deployment/config" >}}#taskmanager-network-netty-sendReceiveBufferSize) may help. If the job is running in Kubernetes environment, using [host network]({{< ref "docs/deployment/config" >}}#kubernetes-hostnetwork-enabled) may also help. |
 | Read buffer request timeout | This means a fierce contention of the shuffle read memory. To solve the issue, you can increase [taskmanager.memory.framework.off-heap.batch-shuffle.size]({{< ref "docs/deployment/config" >}}#taskmanager-memory-framework-off-heap-batch-shuffle-size) together with [taskmanager.memory.framework.off-heap.size]({{< ref "docs/deployment/config" >}}#taskmanager-memory-framework-off-heap-size). |
diff --git a/docs/content/docs/ops/debugging/debugging_classloading.md b/docs/content/docs/ops/debugging/debugging_classloading.md
index 99da741e92b6a..34f2fa20dec38 100644
--- a/docs/content/docs/ops/debugging/debugging_classloading.md
+++ b/docs/content/docs/ops/debugging/debugging_classloading.md
@@ -57,7 +57,7 @@ Java classpath. The classes from all jobs/applications that are submitted agains
 If you package a Flink job/application such that your application treats Flink like a library (JobManager/TaskManager daemons as spawned as needed), then typically
 all classes are in the *application classpath*. This is the recommended way for container-based setups where the container is specifically
-created for an job/application and will contain the job/application's jar files.
+created for a job/application and will contain the job/application's jar files.
 
 -->
diff --git a/docs/content/docs/ops/events.md b/docs/content/docs/ops/events.md
index d8ba0459f6dab..3b66504da3f10 100644
--- a/docs/content/docs/ops/events.md
+++ b/docs/content/docs/ops/events.md
@@ -28,7 +28,7 @@ under the License.
 
 # Events
 
-Flink exposes a event reporting system that allows gathering and exposing events to external systems.
+Flink exposes an event reporting system that allows gathering and exposing events to external systems.
 
 ## Reporting events
 
diff --git a/docs/content/docs/ops/upgrading.md b/docs/content/docs/ops/upgrading.md
index 4a548cfcf8239..be72ce9b0f4e4 100644
--- a/docs/content/docs/ops/upgrading.md
+++ b/docs/content/docs/ops/upgrading.md
@@ -141,7 +141,7 @@ DataStream mappedEvents = events
 
 **Important:** As of 1.3.x this also applies to operators that are part of a chain.
 
-By default all state stored in a savepoint must be matched to the operators of a starting application. However, users can explicitly agree to skip (and thereby discard) state that cannot be matched to an operator when starting a application from a savepoint. Stateful operators for which no state is found in the savepoint are initialized with their default state. Users may enforce best practices by calling `ExecutionConfig#disableAutoGeneratedUIDs` which will fail the job submission if any operator does not contain a custom unique ID.
+By default all state stored in a savepoint must be matched to the operators of a starting application. However, users can explicitly agree to skip (and thereby discard) state that cannot be matched to an operator when starting an application from a savepoint. Stateful operators for which no state is found in the savepoint are initialized with their default state. Users may enforce best practices by calling `ExecutionConfig#disableAutoGeneratedUIDs` which will fail the job submission if any operator does not contain a custom unique ID.
 
 #### Stateful Operators and User Functions
 
diff --git a/docs/content/docs/sql/reference/ddl/create.md b/docs/content/docs/sql/reference/ddl/create.md
index 8140c9e9cc724..f4591011445f9 100644
--- a/docs/content/docs/sql/reference/ddl/create.md
+++ b/docs/content/docs/sql/reference/ddl/create.md
@@ -225,7 +225,7 @@ CREATE TABLE MyTable (
 
 Metadata columns are an extension to the SQL standard and allow to access connector and/or format specific
 fields for every row of a table. A metadata column is indicated by the `METADATA` keyword. For example,
-a metadata column can be be used to read and write the timestamp from and to Kafka records for time-based
+a metadata column can be used to read and write the timestamp from and to Kafka records for time-based
 operations. The [connector and format documentation]({{< ref "docs/connectors/table/overview" >}}) lists the
 available metadata fields for every component. However, declaring a metadata column in a table's schema is optional.
diff --git a/docs/content/docs/sql/reference/queries/hints.md b/docs/content/docs/sql/reference/queries/hints.md
index 519cdf193cf77..e2cd20fc15890 100644
--- a/docs/content/docs/sql/reference/queries/hints.md
+++ b/docs/content/docs/sql/reference/queries/hints.md
@@ -604,7 +604,7 @@ e.g., if there is a row in the `Customers` table:
 ```gitexclude
 id=100, country='CN'
 ```
 
-When processing an record with 'id=100' in the order stream, in 'jdbc' connector, the corresponding
+When processing a record with 'id=100' in the order stream, in 'jdbc' connector, the corresponding
 lookup result is null (`country='CN'` does not satisfy the condition `c.country = 'US'`) because
 both `c.id` and `c.country` are used as lookup keys, so this will trigger a retry.
diff --git a/docs/content/docs/sql/reference/queries/joins.md b/docs/content/docs/sql/reference/queries/joins.md
index be99c737d3f58..d0e1c4882ab06 100644
--- a/docs/content/docs/sql/reference/queries/joins.md
+++ b/docs/content/docs/sql/reference/queries/joins.md
@@ -393,7 +393,7 @@
 order_1 shirt 2 1
 order_1 pants 1 2
 order_1 hat 1 3
 
--- Returns a new row for each instance of a element in a multiset
+-- Returns a new row for each instance of an element in a multiset
 -- If an element has been seen twice (multiplicity is 2), it will be returned twice
 WITH ProductMultiset
   AS (SELECT COLLECT(product_name) AS product_multiset
diff --git a/docs/content/docs/sql/reference/queries/window-tvf.md b/docs/content/docs/sql/reference/queries/window-tvf.md
index 5553cc6fcf140..cdb5fa67c20ee 100644
--- a/docs/content/docs/sql/reference/queries/window-tvf.md
+++ b/docs/content/docs/sql/reference/queries/window-tvf.md
@@ -154,7 +154,7 @@ The return value of `HOP` is a new relation that includes all columns of origina
 HOP(TABLE data, DESCRIPTOR(timecol), slide, size [, offset ])
 ```
 
-- `data`: is a table parameter that can be any relation with an time attribute column.
+- `data`: is a table parameter that can be any relation with a time attribute column.
 - `timecol`: is a column descriptor indicating which time attributes column of data should be mapped to hopping windows.
 - `slide`: is a duration specifying the duration between the start of sequential hopping windows
 - `size`: is a duration specifying the width of the hopping windows.
@@ -225,7 +225,7 @@ The return value of `CUMULATE` is a new relation that includes all columns of or
 CUMULATE(TABLE data, DESCRIPTOR(timecol), step, size)
 ```
 
-- `data`: is a table parameter that can be any relation with an time attribute column.
+- `data`: is a table parameter that can be any relation with a time attribute column.
 - `timecol`: is a column descriptor indicating which time attributes column of data should be mapped to cumulating windows.
 - `step`: is a duration specifying the increased window size between the end of sequential cumulating windows.
 - `size`: is a duration specifying the max width of the cumulating windows. `size` must be an integral multiple of `step`.
@@ -316,7 +316,7 @@ The original time attribute "timecol" will be a regular timestamp column after w
 SESSION(TABLE data [PARTITION BY(keycols, ...)], DESCRIPTOR(timecol), gap)
 ```
 
-- `data`: is a table parameter that can be any relation with an time attribute column.
+- `data`: is a table parameter that can be any relation with a time attribute column.
 - `keycols`: is a column descriptor indicating which columns should be used to partition the data prior to session windows.
 - `timecol`: is a column descriptor indicating which time attributes column of data should be mapped to session windows.
 - `gap`: is the maximum interval in timestamp for two events to be considered part of the same session window.
diff --git a/docs/content/docs/sql/reference/utility/call.md b/docs/content/docs/sql/reference/utility/call.md
index 962d5f384fec7..03469649da3cc 100644
--- a/docs/content/docs/sql/reference/utility/call.md
+++ b/docs/content/docs/sql/reference/utility/call.md
@@ -30,7 +30,7 @@ under the License.
 `Call` statements are used to call a stored procedure which is usually provided to perform data manipulation or administrative tasks.
 
 Attention Currently, `Call` statements require the procedure called to exist in the corresponding catalog. So, please make sure the procedure exists in the catalog.
-If it doesn't exist, it'll throw an exception. You may need to refer to the doc of the catalog to see the available procedures. To implement an procedure, please refer to [Procedure]({{< ref "docs/dev/table/procedures" >}}).
+If it doesn't exist, it'll throw an exception. You may need to refer to the doc of the catalog to see the available procedures. To implement a procedure, please refer to [Procedure]({{< ref "docs/dev/table/procedures" >}}).
 
 ## Run a CALL statement
diff --git a/docs/content/release-notes/flink-2.2.md b/docs/content/release-notes/flink-2.2.md
index e5b5f142bfd57..cfa2e61dd5fac 100644
--- a/docs/content/release-notes/flink-2.2.md
+++ b/docs/content/release-notes/flink-2.2.md
@@ -116,7 +116,7 @@ BINARY and VARBINARY should now correctly consider the target length.
 
 ##### [FLINK-38209](https://issues.apache.org/jira/browse/FLINK-38209)
 
-This is considerable optimization and an breaking change for the StreamingMultiJoinOperator.
+This is considerable optimization and a breaking change for the StreamingMultiJoinOperator.
 As noted in the release notes, the operator was launched in an experimental state for Flink 2.1
 since we're working on relevant optimizations that could be breaking changes.