parse-domains

Python script to extract first level domains using 'tld' library.


  #!/usr/bin/python3
 
  import fileinput #handles file as argument, STDIN or via pipeline
  from tld import get_fld

  contents = fileinput.input()
  for line in contents:
     domain = get_fld(line, fix_protocol=True, fail_silently=True)
     #fix_protocol, allows “protocol://” prefix. 
     #fail_silently, supress errors if no domain detected (Ips, ...)
     if domain: print(domain) #Strip null/invalid responses

It takes input from the pipeline, via a filename or via STDIN.


$ echo 'http://www.a.f.q.d.n.com' | ./parse-domains.py

$ ./parse-domains.py file-with-URLs.txt

$ ./parse-domains < file-with-URLs.txt

It takes one line of URL, FQDN, whatever, and spits out the first level domain, or says nothing. Or terrible things happen.

Name		Name	Last commit message	Last commit date
Latest commit History 6 Commits
README.md		README.md
parse-domains.png		parse-domains.png
parse-domains.py		parse-domains.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

parse-domains

About

Releases

Packages

Languages

jimbobnet/parse-domains

Folders and files

Latest commit

History

Repository files navigation

parse-domains

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages