First, I should mention I am not a programmer, but I am eagerly learning PowerShell!
I am looking for an automated solution to accomplish what I am currently doing manually. I need a script that would combine the following:
- Reach out to a list of websites (probably with a loop of some sort, since the list will come from a spreadsheet that could contain 1 or 100 different sites)
- Search each page for a specific word or words (the keywords are not in the spreadsheet, though putting them there might make it more scalable)
- Save the URLs of the sites that contain the keywords to one text file (instead of the multiple .html files I am creating today)
- Have the output show which words were found on which site.
- If not overly complicated, I would like to schedule this to recur once a week.
A working script would be ideal, but even pointers to resources that show how to incorporate each element would suffice.
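Based on what I have pieced together so far, here is my rough, untested guess at how those pieces might fit together. I'm assuming the spreadsheet gets exported to a CSV at C:\savestuffhere\sites.csv with a column called URL, and that the keywords are hard-coded in the script (both of those are just my assumptions):
# My untested sketch - assumes sites.csv has a "URL" column and the keywords are hard-coded
$keywords = @("Keyword 1", "Keyword 2")                    # words to look for
$sites    = Import-Csv "C:\savestuffhere\sites.csv"        # spreadsheet exported to CSV
$web      = New-Object Net.WebClient
$results  = foreach ($site in $sites) {
    $content = $web.DownloadString($site.URL)              # pull down the page content
    foreach ($word in $keywords) {
        if ($content -match [regex]::Escape($word)) {      # did this page contain the word?
            "$($site.URL) contains '$word'"
        }
    }
}
$results | Out-File "C:\savestuffhere\results.txt"         # one text file: URL plus matched word
For the weekly part, I'm guessing Task Scheduler could just run powershell.exe -File against this saved as a .ps1 once a week, but I haven't gotten that far.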
I have had success pulling down the full content of the listed pages and saving them to a directory, but it still requires manual intervention.
So far this works, but it's not scalable:
Set-ExecutionPolicy RemoteSigned
$web = New-Object Net.WebClient
$web.DownloadString("http://sosomesite/54321.com") | Out-File "C:\savestuffhere\54321.html"
$web.DownloadString("http://sosomesite/54321.com") | Out-File "C:\savestuffhere\65432.html"
Get-ChildItem -Path "C:\savestuffhere\" -Include *.html -Recurse | Select-String -Pattern "Keyword 1"
In other words, I have to manually replace "http://sosomesite/54321.com" and "C:\savestuffhere\54321.html" every time the URL changes to .\65432.com, and update the output name to match. That works fine for a couple of sites, but again, it is not scalable.
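From the examples I have seen, it looks like a loop could build the output name from the end of the URL, so I would not have to edit both lines by hand each time. Something like this is what I am imagining (untested, with the URL list hard-coded just to show the idea; eventually it would come from the spreadsheet):
# Untested idea: derive each .html filename from the last part of the URL
$urls = @("http://sosomesite/54321.com", "http://sosomesite/65432.com")
$web  = New-Object Net.WebClient
foreach ($url in $urls) {
    $name = $url.Split("/")[-1] -replace "\.com$", ""      # e.g. "54321"
    $web.DownloadString($url) | Out-File "C:\savestuffhere\$name.html"
}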
Then, to see if any of the saved files contain the keyword(s), I have to search the directory for the keyword, which I do with:
Get-ChildItem -Path "C:\savestuffhere\" -Include *.html -Recurse | Select-String -Pattern "Keyword 1"