Python script to extract first level domains using 'tld' library

jimbobnet/parse-domains

parse-domains

Python script to extract first level domains using 'tld' library.


  #!/usr/bin/python3

  import fileinput  # reads files named as arguments, or STDIN (including a pipeline)
  from tld import get_fld

  for line in fileinput.input():
      url = line.strip()  # drop the trailing newline and surrounding whitespace
      if not url:
          continue  # skip blank lines
      # fix_protocol: accept input that lacks a "protocol://" prefix
      # fail_silently: return None instead of raising if no domain is detected (IPs, ...)
      domain = get_fld(url, fix_protocol=True, fail_silently=True)
      if domain:
          print(domain)  # print only valid results

It reads input from a pipeline, from a file named as an argument, or from redirected STDIN.


$ echo 'http://www.a.f.q.d.n.com' | ./parse-domains.py

$ ./parse-domains.py file-with-URLs.txt

$ ./parse-domains.py < file-with-URLs.txt
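
All three invocations above work because `fileinput.input()` reads the files named on the command line and falls back to STDIN when none are given. A minimal sketch of that input handling follows; the helper name `read_urls` and its `files` parameter are hypothetical and not part of the script.

```python
import fileinput


def read_urls(files=None):
    """Yield stripped, non-empty lines from the given files.

    With files=None, fileinput uses sys.argv[1:] and falls back to STDIN
    when no filenames were passed. (Hypothetical helper, for illustration.)
    """
    with fileinput.input(files=files) as stream:
        for line in stream:
            url = line.strip()  # remove the trailing newline and stray spaces
            if url:  # ignore blank lines
                yield url
```

Called as `read_urls()` inside a script, it would behave the same way for a filename argument, a pipe, or a `<` redirect.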

What it does

It reads one URL, FQDN, or similar per line and prints the first level domain for each, or prints nothing when no domain is detected. Or terrible things happen.
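
The "one line in, one domain or nothing out" behaviour can be sketched as a small generator. This is only an illustration under stated assumptions: the function name `first_level_domains` and the `extract` hook are hypothetical, added so the loop can be tried without the `tld` package installed. By default it calls `get_fld` with the same arguments as the script.

```python
def first_level_domains(lines, extract=None):
    """Yield a first level domain for each line that has one; skip the rest."""
    if extract is None:
        # Default: the third-party 'tld' library, with the script's arguments.
        from tld import get_fld

        def extract(url):
            return get_fld(url, fix_protocol=True, fail_silently=True)

    for line in lines:
        url = line.strip()
        if not url:
            continue  # blank line: nothing to print
        domain = extract(url)
        if domain:  # None means no domain was detected
            yield domain
```

With the default extractor, something like `for d in first_level_domains(fileinput.input()): print(d)` should reproduce the script's output.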
