A new file generator #336

webmat · 2019-02-20T21:19:41Z

This pull request introduces a simplified generator for the files derived from the ECS core files in schemas/*. The idea is to read everything, augment all fields with defaults, copy reusable fields to their intended destination, and eventually lint for problems or add validations; we then save the intermediate in-memory representation as a generated file (see generated/ecs/fields_flat.yml).

At this point, we can trigger a series of generators based on this representation: either in Python, working from the in-memory dictionary, or in another language, working from this simplified and fully fleshed-out intermediate file.
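To make the flow above concrete, here is a minimal sketch of the "read, apply defaults, copy reusable fields, flatten" steps. It is not the code from this PR: the `SCHEMAS` dictionary, the `DEFAULTS` values, and the `flatten` function are all hypothetical stand-ins, and the real generator reads the YAML files under schemas/* instead of a literal dictionary.

```python
import copy

# Hypothetical stand-in for the parsed contents of schemas/*.yml.
SCHEMAS = {
    "base": {"root": True, "fields": [{"name": "message", "type": "text"}]},
    "geo": {
        "reusable": ["client"],
        "fields": [{"name": "city_name", "type": "keyword"}],
    },
    "client": {"fields": [{"name": "ip", "type": "ip"}]},
}

# Hypothetical defaults, for illustration only.
DEFAULTS = {"level": "extended", "example": ""}


def flatten(schemas):
    """Return {flat_name: field}, with defaults applied and reusable sets copied."""
    flat = {}
    for set_name, field_set in schemas.items():
        # Fields of a root field set get no prefix; others get "<set>.".
        prefix = "" if field_set.get("root") else set_name + "."
        for field in field_set["fields"]:
            flat[prefix + field["name"]] = dict(DEFAULTS, **copy.deepcopy(field))
    # Copy each reusable field set under every destination it lists.
    for set_name, field_set in schemas.items():
        for destination in field_set.get("reusable", []):
            for field in field_set["fields"]:
                flat_name = f"{destination}.{set_name}.{field['name']}"
                flat[flat_name] = dict(DEFAULTS, **copy.deepcopy(field))
    return flat


flat_fields = flatten(SCHEMAS)
print(sorted(flat_fields))
# ['client.geo.city_name', 'client.ip', 'geo.city_name', 'message']
```

Every downstream generator can then iterate over `flat_fields` without knowing anything about how the schema files are laid out.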

This PR introduces the following generators:

The one that saves the intermediary YML representation
The schema.csv file. It moves from the root to generated/schema.csv
Elasticsearch 6 and 7 sample templates, at generated/elasticsearch/*
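As a rough illustration of why two sample templates are needed: Elasticsearch 6 expects the mappings to sit under a mapping type name, while Elasticsearch 7 templates are typeless. The sketch below assumes a flat field dictionary as input; `build_template`, the `ecs-*` index pattern, and the `1.0.0` version string are assumptions made for the example, not the PR's actual code or output.

```python
import json


def build_template(flat_fields, ecs_version, es_major):
    """Build a sample index template dict for the given Elasticsearch major version."""
    properties = {}
    for flat_name, field in sorted(flat_fields.items()):
        *parents, leaf = flat_name.split(".")
        node = properties
        for parent in parents:
            # Descend into (or create) the object that holds the nested fields.
            node = node.setdefault(parent, {"properties": {}})["properties"]
        node[leaf] = {"type": field["type"]}

    mappings = {"_meta": {"version": ecs_version}, "properties": properties}
    if es_major < 7:
        # Elasticsearch 6 expects mappings under a mapping type name.
        mappings = {"_doc": mappings}
    return {"index_patterns": ["ecs-*"], "mappings": mappings}


# Hypothetical sample fields; the real input is the generated flat field list.
fields = {"message": {"type": "text"}, "client.ip": {"type": "ip"}}
for major in (6, 7):
    print(json.dumps(build_template(fields, "1.0.0", major), indent=2))
```

The sketch does not handle a name that is both a leaf field and a parent object; a real generator would need to decide how to treat that case.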

This PR also introduces a few other things:

The ECS version to be used in code generation is now saved in the file version at the root of the repo
Some Python tests for the generator
Python tests intended to spec ECS itself. The introductory test in this file ensures we don't introduce a bug where the base fields are nested under base.* :-) The file is scripts/tests/test_ecs_spec.py, and should be used for any high-level truism we want to ensure about ECS itself, not for typical corner cases and unit tests.
These tests run as part of make check in Travis, or can be called specifically with make test.
The new generator is automatically called in the global make generate, but you can run only the new generator with make generator
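For readers unfamiliar with the idea of a "spec" test, the check on base fields could look roughly like the sketch below. This is not the content of scripts/tests/test_ecs_spec.py: the `FLAT_FIELDS` dictionary is a hypothetical stand-in for the generator's in-memory output, and the test name is made up.

```python
import unittest

# Hypothetical stand-in for the generator's flat field dictionary.
FLAT_FIELDS = {
    "@timestamp": {"type": "date"},
    "message": {"type": "text"},
    "client.ip": {"type": "ip"},
}


class TestEcsSpec(unittest.TestCase):
    def test_base_fields_are_not_nested_under_base(self):
        # Base fields live at the root of the event, so no flat name
        # should ever start with "base.".
        nested = [name for name in FLAT_FIELDS if name.startswith("base.")]
        self.assertEqual(nested, [])


# Run the suite explicitly so the sketch also works outside a test runner.
suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestEcsSpec)
result = unittest.TextTestRunner(verbosity=0).run(suite)
```

A test of this kind states a truth about the spec rather than about one code path, so it should keep passing no matter how the generator is refactored.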

TODO before merging

Generate different index template for v6 and v7
Remove template.json and schema.csv at the root (they're now in generated/*/)
Readme: point people to the generated files.
- gist: they'll all gradually move to generated/*/ from now on, except for docs
Set a default object_type for type: object fields? (only one without a default at this time)

Not in scope for this first PR, but should get attention soon:

adding validations / linting rules
generating asciidoc
generating the readme (will be replaced by asciidoc soon anyway, so will not be ported)
generating "perfect" beats yml defs
generating the Kibana JSON or the Go library
generating a sample Kibana index pattern in line with the ES templates
generating a Kibana canvas workpad to explore ECS

webmat · 2019-02-22T05:02:13Z

@MikePaquette Ok, the new generator is now able to generate the csv file and the template, including the reusable fields.

In the future I'd like to mostly output the generated files all in one place, in generated/*. But for the purpose of the pull request diffs, I'm also overwriting the old version of the two files, right at the root of the repo.

MikePaquette · 2019-02-22T19:22:22Z

@webmat love the new CSV output with the nested fields!
Question: Should we / can we include a version column in the CSV output with the ECS release version?
Seems a bit redundant to have it in every row, but we'd be able to import each CSV file into Elasticsearch and keep statistics and visualizations of ECS over time.

webmat · 2019-02-22T19:36:33Z

@MikePaquette I'm a bit hesitant about this, because as you say, it will be the same value for every single line.

I would rather recommend tweaking the import process or script to add this value to the destination, every time a new import is made of a new ECS version.

MikePaquette · 2019-02-22T20:00:44Z

Thanks @webmat. My preference is that the CSV file should have some indication of what version of ECS it represents, regardless of any processes that consume it downstream. Yes, putting the version on every row seems a bit silly, but it works, and it's fewer than 300 rows, so not a big deal from a space perspective.

webmat · 2019-02-22T20:03:22Z

Alright, let's do it, then :-)

- Flat field list generated in generated/ecs/fields_flat.yml
- Nested field list (similar to structure of schemas/*.yml) in generated/ecs/fields_nested.yml
- Reusable fields are correctly listed in fields_flat.yml only, at this time.
- Added a separate test file that's for sanity checks. More about the state of the ECS spec than testing code corner cases.


andrewkroh

The plan SGTM. I didn't spend much time on the python code, but converging all of the generation code to one place/one language that's fully encapsulated in this repo should be nice.

andrewkroh · 2019-02-25T21:18:35Z

scripts/generators/csv_generator.py

+                field['type'],
+                field['level'],
+                field.get('example', ''),
+                version


If this were a @since <version> type of field that indicates when the field was first added to the spec this would be useful to anyone trying to write backwards compatible code.

Good idea. That's not currently the intent, but I'm noting as a future improvement.

The gist of this field specifically is that one can import schema.csv into a spreadsheet repeatedly, once for each ECS version, and then have all versions at their disposal. A request from Mike.

But I love the idea of the introduction version for each field... We'll see how we can work that in.
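To illustrate the version column discussed in this thread, here is a small sketch of a CSV writer that appends the same ECS version to every row. The row layout follows the diff excerpt above (type, level, example, version); the header names, the sample fields, the `write_csv` function, and the `1.0.0` version string are assumptions made for the example. In the repo the version would come from the `version` file at the root.

```python
import csv
import io


def write_csv(flat_fields, version, out):
    """Write one row per field, repeating the ECS version on every row."""
    writer = csv.writer(out, lineterminator="\n")
    # Hypothetical header names, for illustration only.
    writer.writerow(["Field", "Type", "Level", "Example", "ECS_Version"])
    for name in sorted(flat_fields):
        field = flat_fields[name]
        writer.writerow([
            name,
            field["type"],
            field["level"],
            field.get("example", ""),
            version,
        ])


# Hypothetical sample fields; the real input is the generated flat field list.
sample_fields = {
    "message": {"type": "text", "level": "core", "example": "Hello World"},
    "client.ip": {"type": "ip", "level": "core"},
}
buffer = io.StringIO()
write_csv(sample_fields, "1.0.0", buffer)
print(buffer.getvalue())
```

Because the version travels with each row, files from several ECS releases can be imported into the same spreadsheet or index and still be told apart.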


This introduces a simplified generator for files, based on the ECS core files in `schemas/*`. The idea is to read everything, augment all fields with defaults, copy reusable fields to their intended destination, eventually lint for problems or add validations; then we save the intermediate in-memory representation as a generated file (see `generated/ecs/fields_flat.yml`).

At this point, we can trigger a series of generators based on this: either in Python, based on the in-memory dictionary, or in another language, based on this simplified and fully fleshed-out intermediate file.

This PR introduces the following generators:

- The one that saves the intermediary YML representation in `generated/ecs/`
- The schema.csv file. It moves from the root to `generated/schema.csv`
- Elasticsearch 6 and 7 sample templates, at `generated/elasticsearch/*`
- The old schema.csv and template.json have been moved to `generated/legacy/` for the time being (still generated).

This PR also introduces a few other things:

- ECS version to be used in code generation is now saved in the file `version` at the root of the repo
- Some Python tests for the generator
- Python tests intended to spec ECS itself. The introductory test in this file ensures we don't introduce a bug where the base fields are nested under `base.*` :-) The file is `scripts/tests/test_ecs_spec.py`, and should be used for any high-level truism we want to ensure about ECS itself, not for typical corner cases and unit tests.
- These tests run as part of `make check` in Travis, or can be called specifically with `make test`.
- The new generator is automatically called in the global `make generate`, but you can run only the new generator with `make generator` #naming-legend
- New section in the readme, pointing people to the generated files

Backport of PR #336 to 1.0 branch. The original message is the same as the merge commit message of #336 above.

* Re-generate files after rebasing

webmat added 1.0.0-ga needs_backport in progress labels Feb 20, 2019

webmat changed the title ~~WIP of a new generator for the files.~~ WIP of a new file generator Feb 20, 2019

webmat force-pushed the generator-memory-repr branch 4 times, most recently from 0fd3351 to 00bdb10 Compare February 22, 2019 04:23

webmat mentioned this pull request Feb 22, 2019

Root mapping definition has unsupported parameters #337

Closed

webmat force-pushed the generator-memory-repr branch from 00bdb10 to 9f272a4 Compare February 22, 2019 20:32

Mathieu Martin added 15 commits February 22, 2019 16:24

Start rewriting generator

b9e3963

Set base fieldset nesting

50a9c57

Set most defaults, test some cases

36b43ec

Add some multi_fields support

b9f7d73

Recent field def format updates

25f8324

Set base.root instead of base.prefix

8bfd193

Less bold

05b91b9

Set root:true for base field set

ec1bd89

Autopep talk

8612a85

make unit, make generator

da9bebd

Separate out the running of the generator to generator.py

ec6a2f6

First generator, csv

7a42a2a

I miss Ruby

8b25e01

fields.csv now lists base fields first

fb1715a

Mathieu Martin added 7 commits February 22, 2019 16:24

code format

85a444d

Actually run the old generator as well

ef6242a

Remove new build step for tests. Tests now run as part of make check

582c68f

The generator for intermediary files isn't special. Now with other ge…

4ddef92

…nerators

Add version file

35f3baa

Generate CSV with version column

6c8083c

No longer hardcode version, when generating ES template

d4f2b54

webmat force-pushed the generator-memory-repr branch from 9f272a4 to d4f2b54 Compare February 22, 2019 21:26

webmat mentioned this pull request Feb 22, 2019

Change url.port datatype to long #339

Merged

Mathieu Martin added 2 commits February 22, 2019 16:40

Code format

61bc73a

Revert the fix for url.port, now addressed in elastic#339

eacbda3

andrewkroh approved these changes Feb 25, 2019


Mathieu Martin added 2 commits February 25, 2019 16:33

Clarify that this is the ECS version

179c903

Generate both ES 6 and 7 sample templates

109ee45

andrewkroh mentioned this pull request Feb 25, 2019

Dots in field names for generated Go code #343

Closed

Mathieu Martin added 4 commits February 26, 2019 10:04

Output to deprecated files to generated/legacy for now.

6fc3efe

Point people to generated/* from the readme. Also added a missing sec…

eced739

…tion link

Code format

e30eee5

Set default object_type on type: object fields.

4606ba9

webmat changed the title ~~WIP of a new file generator~~ A new file generator Feb 26, 2019

webmat merged commit ca9d77f into elastic:master Feb 26, 2019

webmat deleted the generator-memory-repr branch February 26, 2019 18:46

This was referenced Feb 26, 2019

Finish full support of "reusable" field sets #328

Closed

Provide support or tooling to help generate an ECS template #324

Closed

webmat removed the in progress label Mar 4, 2019

webmat mentioned this pull request Mar 5, 2019

Backport #336 to 1.0: A new file generator (#336) #366

Merged

webmat removed the needs_backport label Mar 5, 2019
