Generated geo info is bogus #115

Open
cavokz opened this issue Nov 23, 2022 · 4 comments
Labels
bug Something isn't working

Comments

Collaborator

cavokz commented Nov 23, 2022

The generated geo fields contain totally random data. Example:

"geo": {
  "continent_name": "uTe",
  "region_iso_code": "Kyp",
  "city_name": "uvJ",
  "country_iso_code": "biz",
  "country_name": "buM",
  "location": [
    0,
    0
  ],
  "region_name": "CPr"
}
cavokz added the bug label on Nov 23, 2022
Collaborator Author

cavokz commented Nov 23, 2022

Geneve is not able to generate meaningful data for these fields, especially geo.location, which is always [0,0].

Things can be improved with an external mechanism that uses the IP address as a key into a GeoIP database, such as the following Elasticsearch ingest pipeline. This would not help, though, when no geo info is mapped to the given IP address.

{
  "description": "Add geoip info",
  "processors": [
    {
      "geoip": {
        "field": "client.ip",
        "target_field": "client.geo",
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "source.ip",
        "target_field": "source.geo",
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "destination.ip",
        "target_field": "destination.geo",
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "server.ip",
        "target_field": "server.geo",
        "ignore_missing": true
      }
    },
    {
      "geoip": {
        "field": "host.ip",
        "target_field": "host.geo",
        "ignore_missing": true
      }
    }
  ]
}
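
The five processors above differ only in the field prefix, so the same pipeline body can be generated rather than written by hand. The following is a minimal sketch; the function name `geoip_pipeline` is hypothetical and not part of Geneve or Elasticsearch, and the output is only the JSON body that would still have to be sent to the ingest pipeline API.

```python
import json


def geoip_pipeline(prefixes):
    """Build a pipeline body with one geoip processor per ECS field prefix."""
    return {
        "description": "Add geoip info",
        "processors": [
            {
                "geoip": {
                    "field": f"{prefix}.ip",
                    "target_field": f"{prefix}.geo",
                    "ignore_missing": True,
                }
            }
            for prefix in prefixes
        ],
    }


# Same prefixes, in the same order, as the hand-written pipeline above.
PREFIXES = ["client", "source", "destination", "server", "host"]
print(json.dumps(geoip_pipeline(PREFIXES), indent=2))
```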

A limited workaround is to express constraints in the data model, ex:

...
source.geo.city_name in ("New York", "London", "Rome") and
source.geo.continent_name in ("Europe", "Asia", "Africa")
...

but they would be freely combined in plainly wrong associations, ex:

"geo": {
  "continent_name": "Asia",
  "city_name": "Rome"
}
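
The wrong associations come from each field being chosen independently. One way around that, sketched below, is to pick a whole record at once from a table of consistent entries. This is not how Geneve works today: `GEO_RECORDS` and `pick_geo` are hypothetical names, the coordinates are rough approximations, and the `[lon, lat]` order is assumed from the Elasticsearch geo_point array convention.

```python
import copy
import random

# Hypothetical table: every entry is one internally consistent geo record.
# "location" is approximate and assumed to be in [lon, lat] order.
GEO_RECORDS = [
    {
        "city_name": "New York",
        "country_iso_code": "US",
        "country_name": "United States",
        "continent_name": "North America",
        "location": [-74.0, 40.7],
    },
    {
        "city_name": "London",
        "country_iso_code": "GB",
        "country_name": "United Kingdom",
        "continent_name": "Europe",
        "location": [-0.1, 51.5],
    },
    {
        "city_name": "Rome",
        "country_iso_code": "IT",
        "country_name": "Italy",
        "continent_name": "Europe",
        "location": [12.5, 41.9],
    },
]


def pick_geo(rng):
    """Return a copy of one whole record, so fields are never mixed across records."""
    return copy.deepcopy(rng.choice(GEO_RECORDS))


rng = random.Random()
print(pick_geo(rng))
```

Because the record is chosen as a unit, "Rome" can only ever appear together with "Europe". The trade-off is that the variety of generated values is limited to whatever the table contains.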


stephmilovic commented Jan 25, 2023

I found a comment in Slack saying that the geoip processor sometimes needs its database download to be forced. I added this to the top of my network script (with a 2-second sleep, because I don't know how to make it async and it needs to finish before everything else runs). The geo pipeline is now running as expected 🎉

curl -fs -XPUT -H "Content-Type: application/json" "$TEST_ELASTICSEARCH_URL/_cluster/settings" --data @- <<EOF
{
  "transient": {
    "ingest": {
      "geoip": {
        "downloader": {
          "enabled": "true"
        }
      }
    }
  }
}
EOF
sleep 2

Collaborator Author

cavokz commented Jan 25, 2023

Interesting! Thanks.

@stephmilovic

My script above is unreliable because of the sleep. For anyone encountering this problem, the quickest way I have found to resolve it is to toggle the geoip.downloader.enabled value in Dev Tools:

PUT _cluster/settings
{
  "transient": {
    "ingest": {
      "geoip": {
        "downloader": {
          "enabled": "false"
        }
      }
    }
  }
}

PUT _cluster/settings
{
  "transient": {
    "ingest": {
      "geoip": {
        "downloader": {
          "enabled": "true"
        }
      }
    }
  }
}
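
For scripted setups, a fixed sleep could be replaced by polling until the databases are reported as available. The sketch below is untested against a real cluster and rests on assumptions: it reads the `databases_count` value from the `GET _ingest/geoip/stats` response, it expects an endpoint reachable without authentication, and `fetch_geoip_stats` and `wait_for_geoip` are hypothetical helper names.

```python
import json
import time
import urllib.request


def fetch_geoip_stats(base_url):
    """GET <base_url>/_ingest/geoip/stats and return the decoded JSON body."""
    with urllib.request.urlopen(f"{base_url}/_ingest/geoip/stats") as resp:
        return json.load(resp)


def wait_for_geoip(fetch, timeout=60.0, interval=1.0, sleep=time.sleep):
    """Poll `fetch` until it reports at least one database.

    Returns True once databases_count > 0, or False after roughly
    `timeout` seconds of waiting between polls.
    """
    waited = 0.0
    while True:
        stats = fetch().get("stats", {})
        if stats.get("databases_count", 0) > 0:
            return True
        if waited >= timeout:
            return False
        sleep(interval)
        waited += interval
```

A script could call `wait_for_geoip(lambda: fetch_geoip_stats(url))` right after enabling the downloader, with `url` taken from `TEST_ELASTICSEARCH_URL`, and abort if it returns False. Passing `fetch` and `sleep` as parameters keeps the loop easy to exercise without a running cluster.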
