You may download a
free trial version of the WebRobot v1.1 component here,
to test this project.
Interacting with sites that use JavaScript to generate
content has, until now, been either very complex, or almost impossible. This
tutorial will demonstrate the usage of the WebRobot v1.1 component to interact
with the social bookmarking site digg, which employs JavaScript heavily to
generate the displayed content, and to interact with it.
First, we will create our instance of the WebRobot
component, and enable AJAX mode:

    Private wrobot As New foxtrot.xray.WebRobot

    Private Sub Form1_Load(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles MyBase.Load
        wrobot.AJAX = True
    End Sub

    Private Sub Form1_Closing(ByVal sender As Object, _
            ByVal e As System.ComponentModel.CancelEventArgs) _
            Handles MyBase.Closing
        wrobot.Dispose()
    End Sub

We created our instance of the WebRobot, enabled AJAX
mode, and then, in the Closing event of our form, called the Dispose method to
release all resources. Now, we will log in to digg:

    'Load the main digg page
    wrobot.LoadPage("http://digg.com")
    'Get the login form
    Dim loginform As foxtrot.xray.Form = wrobot.GetFormByContainsAction("login")
    'Username field
    Dim userfield As foxtrot.xray.Input.Text = loginform.Fields(0)
    'Password field
    Dim pswdfield As foxtrot.xray.Input.Password = loginform.Fields(1)
    'Submit button
    Dim sbmtfield As foxtrot.xray.Input.Submit = loginform.Fields(3)
    userfield.Value = username
    pswdfield.Value = password
    'Simulate a click on the submit button
    sbmtfield.Click()

After loading the main page, and filling out the login
form, we clicked on the submit button. We could have used the WebRobot's
SubmitForm method, but since this page may use JavaScript for form and button
events, it would be safer to just simulate a click, so that any code gets
interpreted. The Click method blocks until all actions are performed and any
necessary page navigation is complete.
Now, we can start parsing through the main page
content, to detect all the news items displayed. The WebRobot v1.1 component has
an Element object and a FindElements method that allow sifting through the page.
The Element object also exposes a Click method, to allow clicking on the elements
you find after parsing. Let's look for news items:

    Dim newsitems As New System.Collections.ArrayList
    'Get the list of DIV elements on the web page
    Dim elements() As foxtrot.xray.Element = wrobot.FindElements("div")
    For Each item As foxtrot.xray.Element In elements
        'Remove the CR and LF characters at the start of the element that the
        'digg html source contains
        Dim text As String = item.Text.TrimStart(vbCrLf.ToCharArray()).ToLower
        'Look for DIVs of news-summary class
        If (text.IndexOf("<div class=news-summary") = 0) Then
            newsitems.Add(item)
        End If
    Next

The newsitems list now holds the DIVs containing our news items. Note that we
search the Text property of each element, which holds its HTML markup, to
identify the class of the DIV.
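
The prefix test used above can be tried in isolation. The following is a minimal sketch, not part of the WebRobot component: the helper name StartsWithTag and the sample markup are hypothetical, and it assumes .NET 2.0 or later for the StringComparison overload. It trims leading CR and LF characters and compares case-insensitively, which has the same effect as the TrimStart, ToLower and IndexOf = 0 combination in the loop.

```vbnet
Imports System
Imports Microsoft.VisualBasic

Module ClassPrefixSketch
    ' Hypothetical helper: returns True when the markup starts with the given
    ' opening-tag prefix, ignoring leading CR/LF characters and letter case.
    Function StartsWithTag(ByVal markup As String, ByVal prefix As String) As Boolean
        Dim trimmed As String = markup.TrimStart(ControlChars.Cr, ControlChars.Lf)
        Return trimmed.StartsWith(prefix, StringComparison.OrdinalIgnoreCase)
    End Function

    Sub Main()
        ' Hypothetical sample markup; the real digg markup may differ.
        Dim sample As String = ControlChars.CrLf & "<DIV class=news-summary><H3>Title</H3></DIV>"
        Console.WriteLine(StartsWithTag(sample, "<div class=news-summary"))                      ' prints True
        Console.WriteLine(StartsWithTag("<DIV class=sidebar></DIV>", "<div class=news-summary")) ' prints False
    End Sub
End Module
```

Because this is a plain prefix match, a DIV whose class merely begins with news-summary would also pass the test.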
Now that we have our list of DIVs, we will parse the
content from them:

    For Each newsitem As foxtrot.xray.Element In newsitems
        'Object to store parsed article info
        Dim artinfo As New ArticleInfo
        'Get the H3s in the item, to look for the title
        Dim titledata() As foxtrot.xray.Element = newsitem.FindElements("H3")
        'The first H3 contains the title, now find the A HREF containing
        'the news link
        Dim urldata() As foxtrot.xray.Element = titledata(0).FindElements("A")
        'The first A HREF found contains the news link
        Dim ahref As String = urldata(0).Text
        'Regular expression to get the URL and the title of the story
        Dim parser As New _
            System.Text.RegularExpressions.Regex("href=""(.*)"".*>(.*)</", _
            System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
            System.Text.RegularExpressions.RegexOptions.Singleline)
        'Store the URL and title
        artinfo.URL = parser.Matches(ahref).Item(0).Groups.Item(1).Value
        artinfo.Title = parser.Matches(ahref).Item(0).Groups.Item(2).Value
        'More parsing code follows
        .
        .
        .
    Next

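
One thing to watch with the regular expression above is that (.*) is greedy. If the anchor markup carried another quoted attribute after href, the first group would run through to the last quote. The sketch below shows a stricter variant on a hypothetical sample string. The helper name TryParseAnchor and the sample markup are assumptions for illustration, and whether the real digg markup needs this depends on how the Text property renders attributes.

```vbnet
Imports System
Imports System.Text.RegularExpressions

Module AnchorRegexSketch
    ' Stops the URL at the first closing quote and the title at the first "</".
    Private ReadOnly AnchorPattern As New Regex( _
        "href=""([^""]*)""[^>]*>(.*?)</", _
        RegexOptions.IgnoreCase Or RegexOptions.Singleline)

    ' Hypothetical helper: returns False instead of throwing when nothing matches.
    Function TryParseAnchor(ByVal markup As String, ByRef url As String, _
            ByRef title As String) As Boolean
        Dim m As Match = AnchorPattern.Match(markup)
        If Not m.Success Then
            Return False
        End If
        url = m.Groups(1).Value
        title = m.Groups(2).Value
        Return True
    End Function

    Sub Main()
        ' Hypothetical anchor markup with a quoted attribute after href.
        Dim sample As String = _
            "<A href=""http://example.com/story"" title=""A story"">Example story</A>"
        Dim url As String = Nothing
        Dim title As String = Nothing
        If TryParseAnchor(sample, url, title) Then
            Console.WriteLine(url)    ' prints http://example.com/story
            Console.WriteLine(title)  ' prints Example story
        End If
    End Sub
End Module
```

Calling Match once and checking Success also avoids running the expression twice and indexing into an empty match collection when an item has no link.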
We found the URL and title of the story by searching
within the DIV. Now, we will find the number of diggs, the digg This! link, and
the digg discussion link for each news item:

    'The amount of diggs is contained in a STRONG element. Find the one
    'with a class that matches diggs-strong-
    Dim digginfo() As foxtrot.xray.Element = newsitem.FindElements("strong")
    For Each item As foxtrot.xray.Element In digginfo
        Dim text As String = item.Text.TrimStart(vbCrLf.ToCharArray()).ToLower
        If (text.IndexOf("<strong id=diggs-strong-") = 0) Then
            parser = New System.Text.RegularExpressions.Regex(">(.*)</", _
                System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
                System.Text.RegularExpressions.RegexOptions.Singleline)
            'Store the diggs count
            artinfo.Diggs = _
                Integer.Parse(parser.Matches(text).Item(0).Groups.Item(1).Value)
        End If
    Next
    'The digg this! link and the digg discussion links are stored in A HREFs
    urldata = newsitem.FindElements("A")
    For Each item As foxtrot.xray.Element In urldata
        If (item.Text.IndexOf("digg it") > -1) Then
            'If item contains digg it, it's the digg this! link.
            'If the user has already dugg the item, this link will
            'not be present. If present, we will store the Element
            'object to simulate a click
            artinfo.DiggLink = item
        ElseIf (item.Text.IndexOf("class=more") > -1) Then
            'If the A HREF class is more, then this is the digg discussion link
            parser = New System.Text.RegularExpressions.Regex("href=""(.*)"".*>(.*)</", _
                System.Text.RegularExpressions.RegexOptions.IgnoreCase Or _
                System.Text.RegularExpressions.RegexOptions.Singleline)
            artinfo.DiggMore = parser.Matches(item.Text).Item(0).Groups.Item(1).Value
        End If
    Next
    'Create a new item for the main article ListView
    Dim litem As New ListViewItem(artinfo.Title)
    'Store the article info in the articlelist HashTable
    articlelist(litem) = artinfo
    ListView1.Items.Add(litem)

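
The ArticleInfo class and the articlelist Hashtable are used above without being shown. A minimal sketch of what they might look like follows. The member names mirror how the code above uses them, but the types are assumptions. In particular, DiggLink is declared As Object here only to keep the sketch self-contained, where the tutorial code stores a foxtrot.xray.Element, and a plain Object stands in for the ListViewItem key.

```vbnet
Imports System
Imports System.Collections

' Hypothetical container for the parsed values (types are assumptions).
Public Class ArticleInfo
    Public URL As String        ' Link to the story
    Public Title As String      ' Story title
    Public Diggs As Integer     ' Current digg count
    Public DiggLink As Object   ' Element to click; Nothing once the story is dugg
    Public DiggMore As String   ' Link to the digg discussion
End Class

Module ArticleListSketch
    Sub Main()
        Dim articlelist As New Hashtable()

        Dim info As New ArticleInfo()
        info.Title = "Example story"
        info.URL = "http://example.com/story"
        info.Diggs = 42

        ' Any object can serve as the key; the tutorial uses the ListViewItem.
        Dim key As New Object()
        articlelist(key) = info

        ' Hashtable returns Object, so cast back to ArticleInfo.
        Dim found As ArticleInfo = DirectCast(articlelist(key), ArticleInfo)
        Console.WriteLine(found.Title & ": " & found.Diggs.ToString() & " Diggs")
        ' prints Example story: 42 Diggs
    End Sub
End Module
```

Keying the table on the list item itself means the event handlers can go from the selected item straight to its article info without keeping a separate index.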
We have populated our form with the article info. Now, we add code to launch a
web browser instance with the story link of the item that was double-clicked:

    Private Sub ListView1_DoubleClick(ByVal sender As Object, _
            ByVal e As System.EventArgs) Handles ListView1.DoubleClick
        'Are there any selected items?
        If (ListView1.SelectedItems.Count > 0) Then
            'Get the article info related to the selected item
            Dim item As ListViewItem = ListView1.SelectedItems(0)
            Dim artinfo As ArticleInfo = articlelist(item)
            'Launch a new web browser instance with the URL
            System.Diagnostics.Process.Start(artinfo.URL)
        End If
    End Sub

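
Since the string passed to Process.Start comes from a scraped page, it may be worth checking that it really is a web address before launching it. The sketch below is one possible check, not something the tutorial or the WebRobot component requires. The helper name IsWebUrl is hypothetical, and it assumes .NET 2.0 or later for Uri.TryCreate.

```vbnet
Imports System

Module UrlCheckSketch
    ' Hypothetical helper: returns True only for absolute http or https URLs.
    Function IsWebUrl(ByVal candidate As String) As Boolean
        Dim parsed As Uri = Nothing
        If Not Uri.TryCreate(candidate, UriKind.Absolute, parsed) Then
            Return False
        End If
        Return parsed.Scheme = Uri.UriSchemeHttp OrElse _
               parsed.Scheme = Uri.UriSchemeHttps
    End Function

    Sub Main()
        Console.WriteLine(IsWebUrl("http://example.com/story"))  ' prints True
        Console.WriteLine(IsWebUrl("calc.exe"))                  ' prints False
    End Sub
End Module
```

In the handler above, the call to Process.Start could then be wrapped in If IsWebUrl(artinfo.URL) Then ... End If.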
Now, we create a context menu,
to be displayed whenever the user right-clicks on an article. This context menu
will show the number of diggs (in MenuItem1), enable the user to digg the story
(in MenuItem2), and also launch a browser instance with the digg discussion (in
MenuItem3). First, we will add code to update the digg count and whether the news
item has been dugg or not:

    Private Sub ListView1_Click(ByVal sender As Object, _
            ByVal e As System.EventArgs) Handles ListView1.Click
        'Is there a selected item?
        If (ListView1.SelectedItems.Count > 0) Then
            'Enable the context menu
            ListView1.ContextMenu = ContextMenu1
            'Get the article info related to the selected item
            Dim item As ListViewItem = ListView1.SelectedItems(0)
            Dim artinfo As ArticleInfo = articlelist(item)
            'Update digg count
            MenuItem1.Text = artinfo.Diggs.ToString & " Diggs"
            'Can we digg this item?
            If (artinfo.DiggLink Is Nothing) Then
                'Item already dugg
                MenuItem2.Text = "Dugg!"
                MenuItem2.Enabled = False
            Else
                'We can digg this item
                MenuItem2.Text = "Digg this!"
                MenuItem2.Enabled = True
            End If
        Else
            'Disable the context menu
            ListView1.ContextMenu = Nothing
        End If
    End Sub

Now, we can add code to digg a news item:

    Private Sub MenuItem2_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles MenuItem2.Click
        'Is there a selected item?
        If (ListView1.SelectedItems.Count > 0) Then
            ListView1.ContextMenu = ContextMenu1
            'Get the article info related to the selected item
            Dim item As ListViewItem = ListView1.SelectedItems(0)
            Dim artinfo As ArticleInfo = articlelist(item)
            'Are we sure we can digg this item?
            If Not (artinfo.DiggLink Is Nothing) Then
                'Simulate a click on the digg this! link, which
                'contains JavaScript code, but no valid HREF
                artinfo.DiggLink.Click()
                'Clear this item so that we cannot try to digg it
                'again, update digg count, and update the user
                'interface
                artinfo.DiggLink = Nothing
                artinfo.Diggs += 1
                MenuItem2.Text = "Dugg!"
                MenuItem2.Enabled = False
                MenuItem1.Text = artinfo.Diggs.ToString & " Diggs"
            End If
        End If
    End Sub

Finally, we add code to load a browser window with the digg discussion link:

    Private Sub MenuItem3_Click(ByVal sender As System.Object, _
            ByVal e As System.EventArgs) Handles MenuItem3.Click
        'Are there any selected items?
        If (ListView1.SelectedItems.Count > 0) Then
            'Get the article info related to the selected item
            Dim item As ListViewItem = ListView1.SelectedItems(0)
            Dim artinfo As ArticleInfo = articlelist(item)
            'Launch a new web browser instance with the digg discussion
            System.Diagnostics.Process.Start(artinfo.DiggMore)
        End If
    End Sub

We have interacted with digg,
simulating a real user clicking on links. Short of captchas, it is difficult for
a web application to tell that it's not a real user at the helm.