Channel: Windows PowerShell forum

How do I search a spreadsheet for a list of URLs | then search those URLs for keywords | save the output? (Oh yeah..., and schedule it)


First, I should mention I am not a programmer, but I am eagerly learning PowerShell!

I am looking for an automated solution to accomplish what I am currently doing manually.  I need a script that would combine the following:

  1. Reach out to a list of websites (probably a loop of some sort since the list will come out of a spreadsheet which could contain 1 or 100 different sites)
  2. Search each page for a specific word or words (these are not in the spreadsheet, though putting them there might make it more scalable)
  3. Save the URL of the site(s) that contained the keywords to one text file (versus the multiple .html files I am creating today)
  4. Have the output contain which words it found on which site.
  5. If not overly complicated, I would like to schedule this to recur once a week.
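
The five steps above could be sketched roughly as follows. This is a minimal sketch under stated assumptions, not a tested solution: it assumes PowerShell 3.0 or later and that the spreadsheet has been exported to a CSV file with a column named Url. The helper names Find-Keyword and Search-Site, the file names sites.csv, matches.txt and Check-Sites.ps1, and the task name CheckSites are all made up for illustration.

```powershell
# Hypothetical helper: returns the keywords that occur in a block of text.
function Find-Keyword {
    param(
        [string]$Text,
        [string[]]$Keywords
    )
    # -SimpleMatch treats each keyword as literal text rather than a regex;
    # Select-String is case-insensitive unless -CaseSensitive is given.
    $Keywords | Where-Object {
        Select-String -InputObject $Text -Pattern $_ -SimpleMatch -Quiet
    }
}

# Hypothetical helper: downloads each URL and emits one object per (site, keyword) hit.
function Search-Site {
    param(
        [string[]]$Urls,
        [string[]]$Keywords
    )
    $web = New-Object Net.WebClient
    foreach ($url in $Urls) {
        try {
            $html = $web.DownloadString($url)
        } catch {
            # Skip sites that cannot be reached instead of stopping the whole run.
            Write-Warning "Could not download ${url}: $_"
            continue
        }
        foreach ($word in (Find-Keyword -Text $html -Keywords $Keywords)) {
            [pscustomobject]@{ Url = $url; Keyword = $word }
        }
    }
}

# Example use (not run here; paths and the "Url" column name are assumptions):
# $urls = Import-Csv 'C:\savestuffhere\sites.csv' | Select-Object -ExpandProperty Url
# Search-Site -Urls $urls -Keywords 'Keyword 1', 'Keyword 2' |
#     ForEach-Object { "$($_.Url)`t$($_.Keyword)" } |
#     Out-File 'C:\savestuffhere\matches.txt'
#
# For the weekly run (item 5), one option is Windows Task Scheduler, e.g. from a command prompt:
#   schtasks /Create /SC WEEKLY /D MON /ST 06:00 /TN "CheckSites" /TR "powershell.exe -NoProfile -File C:\savestuffhere\Check-Sites.ps1"
```

Searching the downloaded text in memory means no .html files need to be saved at all; each line of the single output file holds a URL and one keyword found on it.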

A working script would be ideal, but even the resources that show me how to incorporate each element would suffice.

I have had success pulling down the full content of the listed pages and saving them to a directory, but this requires manual intervention.

So far this works, but it's not scalable:
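     # Allow locally written scripts to run (a one-time setting)
     Set-ExecutionPolicy RemoteSigned
     $web = New-Object Net.WebClient
     # One download line per site; the URL and the output file name are edited by hand
     $web.DownloadString("http://sosomesite/54321.com") | Out-File "C:\savestuffhere\54321.html"
     $web.DownloadString("http://sosomesite/65432.com") | Out-File "C:\savestuffhere\65432.html"
     # Search every saved page for the keyword
     Get-ChildItem -Path "C:\savestuffhere\" -Include *.html -Recurse | Select-String -Pattern "Keyword 1"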

In other words, I have to manually replace "http://sosomesite/54321.com" and "C:\savestuffhere\54321.html" whenever the URL changes (for example to 65432.com), and change the output name to match.  That works fine when it's a couple of sites, but again, it is not scalable.
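
One way to avoid editing the URL and file name by hand is to derive the output name from the URL inside a loop. The following is only a sketch: Get-HtmlFileName and Save-Page are hypothetical helper names, and it assumes every URL ends in a file-like segment such as 54321.com (a URL ending in a slash would produce an empty name and would need extra handling).

```powershell
# Hypothetical helper: turns "http://sosomesite/54321.com" into "54321.html".
function Get-HtmlFileName {
    param([string]$Url)
    # Take the last path segment of the URL and swap its extension for .html.
    $lastSegment = ([uri]$Url).Segments[-1]
    [System.IO.Path]::GetFileNameWithoutExtension($lastSegment) + '.html'
}

# Hypothetical helper: downloads every URL into one folder, naming each file automatically.
function Save-Page {
    param(
        [string[]]$Urls,
        [string]$Folder
    )
    $web = New-Object Net.WebClient
    foreach ($url in $Urls) {
        $target = Join-Path $Folder (Get-HtmlFileName -Url $url)
        $web.DownloadString($url) | Out-File $target
    }
}

# Example call (not run here):
# Save-Page -Urls 'http://sosomesite/54321.com', 'http://sosomesite/65432.com' -Folder 'C:\savestuffhere'
```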

Then, to see whether any of the saved files contain the keyword(s), I have to search the directory for the keyword, for which I am using:
Get-ChildItem -Path "C:\savestuffhere\" -Include *.html -Recurse | Select-String -Pattern "Keyword 1"
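
If the pages are kept on disk, the same directory search could be extended to several keywords and made to report which word was found in which file. This is a hedged sketch only; Get-KeywordHit is a hypothetical name and the keywords are treated as literal text.

```powershell
# Hypothetical helper: lists which keyword was found in which saved .html file.
function Get-KeywordHit {
    param(
        [string]$Directory,
        [string[]]$Keywords
    )
    # @() keeps an empty folder from sending a null object down the pipeline.
    $files = @(Get-ChildItem -Path $Directory -Filter *.html -Recurse)
    foreach ($word in $Keywords) {
        # -List stops at the first match in each file, so each (file, keyword)
        # pair is reported at most once; -SimpleMatch means literal text.
        $files | Select-String -Pattern $word -SimpleMatch -List |
            Select-Object Path, Pattern
    }
}

# Example call (not run here):
# Get-KeywordHit -Directory 'C:\savestuffhere' -Keywords 'Keyword 1', 'Keyword 2'
```

Each result carries the file path and the keyword that matched, so the output can be piped to Out-File to produce a single text file.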

