Finding a better name for `url.hostname`, `source.hostname` and `destination.hostname`
As most of you have realized, using the word “hostname” for two different purposes in ECS is bugging me. I’m very happy about the resistance I’ve faced when bringing this up so far, because it has forced me to dig deeper and figure this out more precisely. Hopefully this helps me make a better case for why, in one of these two cases, “hostname” is incorrect.
Two usages of hostname in ECS
Let’s review the two different uses of “hostname” currently in ECS:
- “hostname” as a server or device name, as given by administrators, or the value you get when you run the command `hostname` on a host/device.
  - This use I agree 100% with.
  - Currently used this way in `device.hostname` and `host.hostname`.
  - To complicate the following discussion, some network devices default to returning their main IP address when they don’t have a textual hostname configured.
- “hostname” as a network address, which may be an IP or a domain name.
  - This is the use I object to, and what I want to address here. From most sources I’ve seen, this usage of “hostname” is incorrect; it should be “host”.
  - Currently used this way in `source.hostname`, `destination.hostname` and `url.hostname`.
  - If some of us thought this was meant to be a place to populate with the hostname of a host under management (e.g. via enrichment):
    - This confirms that this is an ambiguous name, because that’s not the actual purpose. The purpose is to store an address prior to having determined whether it’s an IP or a name.
    - This concept comes from the common web log format. See Apache’s “%h” log format string and its HostnameLookups directive, or Traefik’s ClientHost and RequestHost.
    - Enrichment with known host details, if someone wanted to do that, would be done more accurately by nesting a subset of the `host` fields in there instead anyway (e.g. `host.id`, `host.hostname`).
What’s wrong with `hostname`?
In the RFCs I’m aware of, when mentioning “a network address, which may be an IP or a domain name”, they all refer to that as a “host”, not a “hostname”. They sometimes use the word “hostname” (but not always) to refer to an address that’s a registered name (such as a domain name or a local DNS entry).
This is not a recent change either: the oldest RFC I’m linking to below is from 1994, and it uses “host” to mean the “IP or name” concept. Here are a few excerpts:
- 1738 - Uniform Resource Locators (URL) - 3.1. Common Internet Scheme Syntax
  - Quoting:
    - “host: The fully qualified domain name of a network host, or its IP address as a set of four decimal digit groups separated by ‘.’.”
    - `http://<host>:<port>/<path>?<searchpart>`
- 2396 - Uniform Resource Identifiers (URI): Generic Syntax - 3.2.2. Server-based Naming Authority
  - Quoting:
    - `host = hostname | IPv4address`
    - `<userinfo>@<host>:<port>`
    - `server = [ [ userinfo "@" ] hostport ]`
- 3986 - URI Generic Syntax - 3. Syntax Components
  - (Check out the section intro for a nice visual of the URI parts, then scroll down to 3.2.2. Host)
  - Quoting: `host = IP-literal / IPv4address / reg-name`
- 7230 - HTTP/1.1 Message Syntax and Routing - 5.4. Host
  - Quoting: `Host = uri-host [ ":" port ]`
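To make the RFC 3986 `host` production concrete, here is a minimal Python sketch that parses a URL and reports which of the three alternatives (IP-literal, IPv4address, reg-name) its host falls under. The function name `classify_uri_host` is hypothetical, and the classification leans on the standard `ipaddress` module rather than the exact RFC grammar (for instance, IPvFuture literals are not handled). Note that Python's own `urlsplit` exposes this value as `.hostname` even when it is an IP, which is an instance of the same naming mixup.

```python
import ipaddress
from typing import Optional, Tuple
from urllib.parse import urlsplit


def classify_uri_host(url: str) -> Tuple[Optional[str], str]:
    """Return (host, kind) for a URL, where kind loosely follows RFC 3986.

    Hypothetical helper for illustration only; not part of ECS.
    """
    # urlsplit strips the brackets around IPv6 literals and lowercases the value.
    host = urlsplit(url).hostname
    if host is None:
        return (None, "none")
    try:
        ip = ipaddress.ip_address(host)
    except ValueError:
        # Not parseable as an IP, so treat it as a registered name.
        return (host, "reg-name")
    # In a URI, IPv6 addresses only appear inside an IP-literal ("[...]").
    return (host, "IP-literal" if ip.version == 6 else "IPv4address")


for example in (
    "https://www.example.com:443/index.html",
    "https://192.0.2.10/",
    "http://[2001:db8::1]:8080/path",
):
    print(example, "->", classify_uri_host(example))
```

In all three cases the parsed value is a “host” in the RFC sense; only the first one is a name.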
The two terms are sometimes mentioned in the same places, or even used interchangeably in tool documentation (we’re not the only ones facing this mixup), which helps explain why they are so often confused. I haven’t looked farther back in the RFCs, so the mixup may indeed originate there.
However, as @MikePaquette pointed out, using the name “host” in these places would conflict with our top-level field set `host`. Even if `host` is not currently a reusable object, I do think it could become one, and the most obvious places I would expect to nest it are `source.host` and `destination.host`.
So I agree we should not rename these 3 fields to `*.host`.
Here are some suggestions for new field names, based on what I’ve seen in various tools’ documentation and the RFCs. Remember that these proposed renames apply only to `source`, `destination` and `url`, not to `host` and `device`:
- `host_address`
- `uri_host` or `url_host`
- Drop `hostname` from source, destination and url.
  - In most cases people will want to store the value in the proper field (ip or domain) after determining which type it is. Wanting to keep the ambiguous field anyway could be considered a use case, and they’re free to name this field however they want if it’s not in ECS.
  - Independently of the subpoint above, we could also work on finding the right name for this field after Beta1, and just drop the field from ECS temporarily for the Beta1 release.
I’m open to other suggestions, of course.
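For the “drop `hostname`” option, storing the value in its proper field could look like the sketch below. This is only an illustration under assumptions: the helper name `route_address` and the sample values are made up, and the output keys `ip` and `domain` simply mirror the field names mentioned in this thread.

```python
import ipaddress


def route_address(value: str) -> dict:
    """Place an ambiguous address under "ip" or "domain".

    Hypothetical helper; expects a bare address (no port, no brackets).
    """
    try:
        ipaddress.ip_address(value)
    except ValueError:
        # Anything that does not parse as an IP is treated as a name.
        return {"domain": value}
    return {"ip": value}


# Assumed sample values, as they might appear in a web access log.
event = {
    "source": route_address("10.0.0.5"),
    "destination": route_address("db01.example.internal"),
}
print(event)
```

With this approach the ambiguous field never needs to exist in the event at all; a pipeline that still wants the raw value can keep it under a custom name outside ECS.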
Top GitHub Comments
@webmat thanks, LGTM.
BTW, I will make a case for defining a `related.domain` when we get to discuss #67, which would be an array containing a copy of whatever we populate `source.domain`, `destination.domain`, and `url.domain` with.
@ruflin Would you be good with this version of the proposal for Beta1? https://github.com/elastic/ecs/issues/166#issuecomment-436349290