Backport #266 to 1.0: Convert ECS doc from MD to asciidoc guide: initial setup (#266) (#370)

Backport of PR #266 to 1.0 branch. Original message:

* ECS doc conversion from md to asciidoc
* Add back uc-header file
* New asciidoc files
webmat authored Mar 5, 2019
1 parent 2807ce6 commit d87a970
Showing 8 changed files with 668 additions and 0 deletions.
8 changes: 8 additions & 0 deletions docs/contributing.asciidoc
@@ -0,0 +1,8 @@
[[ecs-contributing]]
== Contributing to {ecs}

All information related to ECS is versioned in the https://github.com/elastic/ecs[elastic/ecs] repository. All
changes to ECS happen through Pull Requests submitted through Git.

See the https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[Contribution Guidelines].

76 changes: 76 additions & 0 deletions docs/conventions.asciidoc
@@ -0,0 +1,76 @@
//[[ecs-conventions]]
== {ecs} Conventions

{ecs} is most effective when you understand and follow conventions.

[float]
=== Multi-fields text indexing

Elasticsearch can index text using:

* *Text.* Text indexing allows for full text search, or searching arbitrary words that
are part of the field.
See https://www.elastic.co/guide/en/elasticsearch/reference/current/text.html[Text datatype]
in the {es} Reference Guide.
* *Keywords.* Keyword indexing offers faster exact match filtering and prefix search,
and makes aggregations (for Kibana visualizations) possible.
See the {es} Reference Guide for more information on
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-term-query.html[exact match filtering],
https://www.elastic.co/guide/en/elasticsearch/reference/current/query-dsl-prefix-query.html[prefix search], or
https://www.elastic.co/guide/en/elasticsearch/reference/current/search-aggregations.html[aggregations].


[float]
==== Default Elasticsearch convention

Unless your index mapping or index template specifies otherwise
(as the ECS index template does),
Elasticsearch indexes a text field as `text` at the canonical field name,
and indexes it a second time as `keyword`, nested in a multi-field.

Default Elasticsearch convention:

* Canonical field: `myfield` is `text`
* Multi-field: `myfield.keyword` is `keyword`
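
As a rough illustration, the default convention can be written out as the mapping
Elasticsearch would generate dynamically for a string field. This is a sketch built
in Python, not output fetched from a cluster. The helper name `default_string_mapping`
is hypothetical, and the `ignore_above` value of 256 reflects the default dynamic
mapping for strings.

```python
def default_string_mapping(field_name):
    """Sketch of the mapping produced by Elasticsearch's default dynamic
    mapping for a string field (not read from a live cluster)."""
    return {
        field_name: {
            "type": "text",              # canonical field: full text search
            "fields": {
                "keyword": {             # multi-field: <field_name>.keyword
                    "type": "keyword",
                    "ignore_above": 256,
                }
            },
        }
    }


mapping = default_string_mapping("myfield")
```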

[float]
==== ECS multi-field convention for text

For monitoring use cases, `keyword` indexing is needed almost exclusively, with
full text search on very few fields. Given this premise, ECS defaults
all text indexing to `keyword` at the top level (with very few exceptions).
Any use case that requires full text search indexing on additional fields can add a
https://www.elastic.co/guide/en/elasticsearch/reference/current/multi-fields.html[multi-field]
for full text search. Doing so does not conflict with ECS,
as the canonical field name will remain `keyword` indexed.

ECS multi-field convention for text:

* Canonical field: `myfield` is `keyword`
* Multi-field: `myfield.text` is `text`
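
The ECS convention can be sketched the same way. The helper below is hypothetical
and only builds a mapping fragment. It assumes that adding a `text` multi-field is
an opt-in choice per field, as described above, and it omits any other settings the
actual ECS index template may apply.

```python
def ecs_keyword_mapping(field_name, with_text=False):
    """Sketch of the ECS convention: `keyword` at the canonical field name,
    with an optional `text` multi-field for full text search."""
    field = {"type": "keyword"}
    if with_text:
        # Nested multi-field, addressed as <field_name>.text
        field["fields"] = {"text": {"type": "text"}}
    return {field_name: field}


plain = ecs_keyword_mapping("myfield")
searchable = ecs_keyword_mapping("myfield", with_text=True)
```

In both cases the canonical field stays `keyword`, so the extra multi-field does
not conflict with ECS.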

[float]
==== Exceptions

The only exceptions to this convention are the fields `message` and `error.message`,
which are indexed for full text search only, with no multi-field.
These two fields don't follow the new convention because changing them was deemed
too big of a breaking change, given how widely they are used in Beats.

Any future field that will be indexed for full text search in ECS will however
follow the multi-field convention where `text` indexing is nested in the multi-field.

[float]
=== IDs and most codes are keywords, not integers

Despite the fact that IDs and codes (e.g. error codes) are often integers,
this is not always the case.
Since we want to make it possible to map as many systems and data sources
to ECS as possible, we default to using the `keyword` type for IDs and codes.

Some specific kinds of codes are always integers, like HTTP status codes.
If those have a specific corresponding field (as HTTP status does),
its type can safely be an integer type.
But generic fields like `error.code` cannot have this guarantee, and are therefore `keyword`.
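
A short sketch of why generic codes default to `keyword`: the sample values below
are hypothetical codes from three imaginary sources, and the helper names are
illustrative only. Only the first value survives a round trip through an integer.

```python
def as_keyword(value):
    """Normalize an ID or code to the string form stored in a `keyword` field."""
    return str(value)


def parses_as_int_losslessly(value):
    """True only if converting to int and back gives the same text."""
    try:
        return str(int(value)) == str(value)
    except (TypeError, ValueError):
        return False


# Hypothetical codes: an integer, a symbolic code, and a zero-padded code.
raw_codes = [404, "ERR_TIMEOUT", "0042"]

keyword_codes = [as_keyword(code) for code in raw_codes]
lossless = [parses_as_int_losslessly(code) for code in raw_codes]
```

Storing every value as a string keeps `"ERR_TIMEOUT"` and the leading zeros of
`"0042"` intact, which an integer type could not do.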


386 changes: 386 additions & 0 deletions docs/fields-gen.asciidoc

Large diffs are not rendered by default.

57 changes: 57 additions & 0 deletions docs/fields.asciidoc
@@ -0,0 +1,57 @@
[[ecs-fields]]
== {ecs} Fields

// Add a list of field types w/ brief description so user can get a
// sense of what we're offering without having to scroll.

// Pull in generated field content using `include` statements

[cols="<,<",options="header",]
|=======================================================================
| Fields | Description
| <<ecs-base,Base>> | The base set contains all fields which are on the top level.
These fields are common across all types of events.
| <<ecs-agent,Agent>> | The agent fields contain data about the
agent/client/shipper that created the event.
| <<ecs-cloud,Cloud>> | Fields related to the cloud or infrastructure the events are
coming from.
| <<ecs-container,Container>> | Container fields are used for meta information about the specific container that
is the source of information. These fields help correlate data based on containers
from any runtime.
| <<ecs-destination,Destination>> | Destination fields describe details about the destination of a packet/event.
| <<ecs-device,Device>> | Device fields are used to provide additional information
about the device that is the source of the information. This could be a firewall, network device, etc.
| <<ecs-ecs,ECS>> | Meta-information specific to ECS.
| <<ecs-error,Error>> | These fields can represent errors of any kind. Use them for errors that happen
while fetching events or in cases where the event itself contains an error.
| <<ecs-event,Event>> | The event fields are used for context information about the data itself.
| <<ecs-file,File>> | File fields provide details about each file.
| <<ecs-geo,Geo>> | Geo fields can carry data about a specific location related to
an event or geo information derived from an IP field.
The `geo` fields are expected to be nested at: `destination.geo`, `device.geo`, `host.geo`, `source.geo`.
| <<ecs-host,Host>> | Host fields provide information related to a host. A host can be a physical
machine, a virtual machine, or a Docker container.
| <<ecs-log,Log>> | Fields which are specific to log events.
| <<ecs-network,Network>> | Fields related to network data.
| <<ecs-organization,Organization>> | The organization fields enrich data with
information about the company or entity the data is associated with. These fields help
you arrange or filter data stored in an index by one or multiple organizations.
| <<ecs-os,Operating System>> | The OS fields contain information about the operating system.
| <<ecs-process,Process>> | These fields contain information about a process. These fields can help you
correlate metrics information with a process id/name from a log message. The
`process.pid` often stays in the metric itself and is copied to the global field
for correlation.
| <<ecs-service,Service>> | The service fields describe the service for or from which the data was
collected. These fields help you find and correlate logs for a specific service
and version.
| <<ecs-source,Source>> | Source fields describe details about the source of a packet/event.
| <<ecs-url,URL>> | URL fields provide a complete URL, with scheme, host, and path.
| <<ecs-user,User>> | The user fields describe information about the user
that is relevant to the event. Fields can have one entry or multiple entries.
If a user has more than one id, provide an array that includes all of them.
|=======================================================================
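
To make the nesting described in the table concrete, here is a minimal sketch of an
event with `geo` nested under `source` and a `user.id` holding more than one value.
The specific leaf fields (`source.ip`, `source.geo.country_iso_code`) and all values
are illustrative assumptions, and `flatten` is a hypothetical helper that only shows
how nested objects correspond to dotted field names.

```python
def flatten(doc, prefix=""):
    """Flatten nested objects into dotted field names; lists stay as values."""
    flat = {}
    for key, value in doc.items():
        name = f"{prefix}{key}"
        if isinstance(value, dict):
            flat.update(flatten(value, prefix=name + "."))
        else:
            flat[name] = value
    return flat


event = {
    "source": {
        "ip": "10.0.0.1",
        "geo": {"country_iso_code": "CA"},   # geo nested under source
    },
    "user": {"id": ["1001", "S-1-5-21"]},    # more than one id: use an array
}

flat = flatten(event)
```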

include::fields-gen.asciidoc[]



12 changes: 12 additions & 0 deletions docs/glossary.asciidoc
@@ -0,0 +1,12 @@
//[[ecs-glossary]]
== Glossary of {ecs} Terms

[[glossary-ecs]]
ECS ::

Elastic Common Schema. A common set of document fields, field names, and their respective entity
relationships to be used in the storage of log messages and other data in
Elasticsearch.



33 changes: 33 additions & 0 deletions docs/guidelines.asciidoc
@@ -0,0 +1,33 @@
//[[ecs-guidelines]]
== Guidelines and Best Practices

The {ecs} schema serves best when you follow schema guidelines and best
practices.

[float]
=== General guidelines

* The document MUST have the `@timestamp` field.
* Use the https://www.elastic.co/guide/en/elasticsearch/reference/current/mapping-types.html[data type]
defined for an ECS field.
* Use the `ecs.version` field to define which version of ECS is used.
* Map as many fields as possible to ECS.
* TBD: Include guidelines on when people should contribute to the spec. Link to Contributing.
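
The first and third guidelines can be sketched as a small check on an event before
it is indexed. The function name, the sample event, and the version string `1.0.0`
are illustrative assumptions. The check treats `@timestamp` and `ecs.version` the
same way for simplicity, even though only `@timestamp` is stated as a MUST.

```python
from datetime import datetime, timezone


def missing_guideline_fields(event):
    """Return the guideline fields that are absent from an event."""
    missing = []
    if "@timestamp" not in event:
        missing.append("@timestamp")
    if "version" not in event.get("ecs", {}):
        missing.append("ecs.version")
    return missing


event = {
    "@timestamp": datetime(2019, 3, 5, 12, 0, 0, tzinfo=timezone.utc).isoformat(),
    "ecs": {"version": "1.0.0"},   # illustrative version string
    "message": "User logged in",
}

problems = missing_guideline_fields(event)
problems_for_bare_event = missing_guideline_fields({"message": "no metadata"})
```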

[float]
==== Guidelines for writing fields

* All fields must be lower case
* Combine words using underscore
* No special characters except `_`
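
These three rules lend themselves to a simple check. The sketch below is one
possible reading, not an official validator. It assumes that dots separate name
segments, that digits are acceptable, and that the leading `@` of `@timestamp` is
a special case to allow.

```python
import re

# Each segment: lowercase letters, digits, and underscores only.
_SEGMENT = r"[a-z0-9_]+"
# Optional leading "@" (assumption, so that `@timestamp` passes),
# then one or more dot-separated segments.
_FIELD_NAME = re.compile(rf"@?{_SEGMENT}(?:\.{_SEGMENT})*")


def is_valid_field_name(name):
    """Check a field name against the writing guidelines (sketch)."""
    return _FIELD_NAME.fullmatch(name) is not None
```

For example, `host.ip` and `requests_per_sec` pass, while `Host.IP` and
`host-name` do not.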

[float]
==== Guidelines for naming fields

* *Present tense.* Use present tense unless the field describes historical information.
* *Singular or plural.* Use singular and plural names properly to reflect the field content. For example, use `requests_per_sec` rather than `request_per_sec`.
* *General to specific.* Organise the prefixes from general to specific to allow grouping fields into objects with a prefix like `host.*`.
* *Avoid repetition.* Avoid stuttering of words. If part of the field name is already in the prefix, do not repeat it. Example: `host.host_ip` should be `host.ip`.
* *Use prefixes.* Fields must be prefixed except for the base fields. For example, all `host` fields are prefixed with `host.`. See `dot` notation in FAQ for more details.
* *Avoid abbreviations when possible.* A few exceptions like `ip` exist.
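
The "avoid repetition" guideline can be checked mechanically. The helper below is
a hypothetical sketch that only handles the simplest case, where the last segment
starts with its parent prefix followed by an underscore.

```python
def suggest_name(field_name):
    """Drop a leading '<parent>_' from the last segment when it repeats
    the segment just before it (sketch of the 'avoid repetition' rule)."""
    parts = field_name.split(".")
    if len(parts) < 2:
        return field_name            # base field: nothing to compare against
    parent, leaf = parts[-2], parts[-1]
    stutter = parent + "_"
    if leaf.startswith(stutter) and len(leaf) > len(stutter):
        parts[-1] = leaf[len(stutter):]
    return ".".join(parts)
```

With this sketch, `suggest_name("host.host_ip")` returns `host.ip`, and names
without repetition are returned unchanged.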

73 changes: 73 additions & 0 deletions docs/index.asciidoc
@@ -0,0 +1,73 @@
:doctype: book
:ecs: ECS


include::{asciidoc-dir}/../../shared/versions.asciidoc[]
include::{asciidoc-dir}/../../shared/attributes.asciidoc[]

[[ecs-reference]]
== Elastic Common Schema (ECS) Reference

beta[]

The Elastic Common Schema (ECS) defines a common set of fields for
ingesting data into Elasticsearch. A common schema helps you correlate
data from sources like logs and metrics or IT operations
analytics and security analytics.

ECS is still under development and backward compatibility is not guaranteed. Any
feedback on the general structure, missing fields, or existing fields is appreciated.
For contributions please read the https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[Contribution Guidelines].

[float]
[[versions]]
=== Versions

The master branch of this repository should never be considered an
official release of ECS. You can browse https://github.com/elastic/ecs/releases[official releases] of ECS.

Please note that when the README.md file and other generated files
(like schema.csv and template.json) are not in agreement,
the README.md should be considered the official spec.
The other two files are simply provided as a convenience, and may not always be
fully up to date.

/////
Working TOC
ECS Intro
Versions
MINI TOC???
Fields (generated)
Base
Agent
Cloud
ETC
* Use Case header = Use Cases (generated)
* Implementing
* About (FAQs)
* Contributing
* GLOSSARY?
* Mike's stuff (https://docs.google.com/document/d/1srylXQDgO0z5rbwho1o8nKIxr9icTTl_nXZGYbElxYs/edit#heading=h.y1xeds3o3jae)
ECS Definitions and Entity Relationships
** ECS data model
** Definitions
** Top-Level Namespaces/Objects
** Reusable Namespaces/Objects
** Assets and Asset Lists
** Pseudonymized and Anonymized Data in ECS
** Threat and Vulnerability Data in ECS
** Revisit inbound, outbound bytes/packets
/////

include::fields.asciidoc[]
include::conventions.asciidoc[]
include::guidelines.asciidoc[]
include::use-cases.asciidoc[]
include::contributing.asciidoc[]
include::glossary.asciidoc[]

23 changes: 23 additions & 0 deletions docs/use-cases.asciidoc
@@ -0,0 +1,23 @@
[[ecs-use-cases]]
== Use Cases

The power and versatility of {ecs} is best illustrated through use cases.

NOTE: Some use cases contain ECS fields and additional fields which are not
in ECS to describe the full use case. The fields which are not in ECS are in
italics.

* https://github.com/elastic/ecs/blob/master/use-cases/apm.md[APM]
* https://github.com/elastic/ecs/blob/master/use-cases/auditbeat.md[Auditbeat]
* https://github.com/elastic/ecs/blob/master/use-cases/beats.md[Beats]
* https://github.com/elastic/ecs/blob/master/use-cases/filebeat-apache-access.md[Filebeat Apache]
* https://github.com/elastic/ecs/blob/master/use-cases/kubernetes.md[Kubernetes]
* https://github.com/elastic/ecs/blob/master/use-cases/logging.md[Logging]
* https://github.com/elastic/ecs/blob/master/use-cases/metricbeat.md[Metricbeat]
* https://github.com/elastic/ecs/blob/master/use-cases/tls.md[TLS]
* https://github.com/elastic/ecs/blob/master/use-cases/web-logs.md[Parsing web server logs]

We welcome https://github.com/elastic/ecs/blob/master/CONTRIBUTING.md[contributions] of additional ECS use cases.


