Class: WebBrowser

Inherits:
Object
Defined in:
rlib/webbrowser.rb

Overview

WebBrowser class, derived from Rob Burrowes' Wikarekare source (MIT License). Encapsulates simple methods to log into a web site, and pull pages.

Constructor Details

- (WebBrowser) initialize(host)

Create a WebBrowser instance

Parameters:

  • host (String)

    the host we want to connect to



21
22
23
24
25
# File 'rlib/webbrowser.rb', line 21

def initialize(host)
  @host = host  #Need to do this, as passing nil is different to passing nothing to initialize!
  @cookie = nil  #Matches the cookie attribute read by get_page and post_page
  @debug = false
end

Instance Attribute Details

- (Object) cookie

Returns the value of attribute cookie



# File 'rlib/webbrowser.rb', line 13

def cookie
  @cookie
end

- (Object) debug

Returns the value of attribute debug



# File 'rlib/webbrowser.rb', line 16

def debug
  @debug
end

- (Object) host (readonly)

Returns the value of attribute host



# File 'rlib/webbrowser.rb', line 11

def host
  @host
end

- (Object) page (readonly)

Returns the value of attribute page



# File 'rlib/webbrowser.rb', line 14

def page
  @page
end

- (Object) referer

Returns the value of attribute referer



# File 'rlib/webbrowser.rb', line 15

def referer
  @referer
end

- (Object) session

Returns the value of attribute session



# File 'rlib/webbrowser.rb', line 12

def session
  @session
end

Class Method Details

+ (Object) http_session(host, port = 80) {|wb| ... }

Create a WebBrowser instance, connect to the host via http, and yield the WebBrowser instance.

Automatically closes the http session on returning from the block passed to it.

Parameters:

  • host (String)

    the host we want to connect to

  • port (Fixnum) (defaults to: 80)

    the port the remote web server is running on

  • block (Proc)

Yield Parameters:

  • wb (WebBrowser)

    the session descriptor for further calls.



# File 'rlib/webbrowser.rb', line 33

def self.http_session(host, port = 80)
  wb = self.new(host)
  wb.http_session(port) do
    yield wb
  end
end

+ (Object) https_session(host, port = 443) {|wb| ... }

Create a WebBrowser instance, connect to the host via https, and yield the WebBrowser instance.

Automatically closes the https session on returning from the block passed to it.

Parameters:

  • host (String)

    the host we want to connect to

  • port (Fixnum) (defaults to: 443)

    the port the remote web server is running on

  • block (Proc)

Yield Parameters:

  • wb (WebBrowser)

    the session descriptor for further calls.



# File 'rlib/webbrowser.rb', line 46

def self.https_session(host, port=443)
  wb = self.new(host)
  wb.https_session(port) do
    yield wb
  end
end

Instance Method Details

- (Hash) extract_input_fields(body)

Extract form field values from the html body.

Parameters:

  • body (String)

    The html response body

Returns:

  • (Hash)

    Keys are the field names, values are the field values



# File 'rlib/webbrowser.rb', line 202

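def extract_input_fields(body)
  @inputs = {}
  doc = Nokogiri::HTML(body)
  doc.xpath("//form/input").each do |f|  #Only <input> elements directly under a <form>
    @inputs[f.get_attribute('name')] = f.get_attribute('value')
  end
  return @inputs  #Return the hash, rather than the node set that each returns.
end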

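- (String, nil) extract_link_fields(body)

Find the link whose name attribute is 'URL$1' in the html body, and return the path part of its href.

Parameters:

  • body (String)

    The html response body

Returns:

  • (String, nil)

    The path of the first matching link, or nil if no link matches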



# File 'rlib/webbrowser.rb', line 214

def extract_link_fields(body)
  @inputs = {}  #Side effect: clears any previously extracted form fields.
  doc = Nokogiri::HTML(body)
  doc.xpath("//a").each do |f|
    #Return the path of the first anchor named 'URL$1'.
    return URI.parse( f.get_attribute('href') ).path if(f.get_attribute('name') == 'URL$1')
  end
  return nil
end

- (String) form_values_to_s(form_values = nil, has_q = false)

Takes a hash of the get or post parameters and generates a URL-escaped query string.

Parameters:

  • form_values (Hash) (defaults to: nil)

    Keys are the field names, values are the field values

  • has_q (Boolean) (defaults to: false)

    True if the query already contains a ?, so one is not prepended. The pairs are then appended as they are, so the query should end in ? or &.

Returns:

  • (String)

    The URL-escaped field text for the get or post query to the web server



# File 'rlib/webbrowser.rb', line 228

def form_values_to_s(form_values=nil, has_q = false)
  return "" if form_values == nil
  s = (has_q == true ? "" : "?")
  first = true
  form_values.each do |key,value|
    s += "&" if !first
    #URI.escape was removed in Ruby 3.0; to_s allows non-String keys and values.
    s += "#{URI.encode_www_form_component(key.to_s)}=#{URI.encode_www_form_component(value.to_s)}"
    first = false
  end
  return s
end
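
The same string can be built with the standard library's URI.encode_www_form. The following is a minimal sketch, not part of WebBrowser: query_suffix is a hypothetical name, and unlike form_values_to_s it adds a joining & when the query already has a ?.

```ruby
require 'uri'

# Hypothetical helper (not in rlib/webbrowser.rb): build the text to append
# to a query, using URI.encode_www_form for the escaping and joining.
def query_suffix(form_values = nil, has_q = false)
  return "" if form_values.nil? || form_values.empty?
  # Assumption: when the query already holds a ?, further pairs follow an &.
  (has_q ? "&" : "?") + URI.encode_www_form(form_values)
end

puts query_suffix({ 'user' => 'admin', 'note' => 'a b&c' })  # ?user=admin&note=a+b%26c
puts query_suffix({ 'outlet' => 3 }, true)                   # &outlet=3
```

Spaces are escaped as + and a literal & as %26, which is the application/x-www-form-urlencoded form.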

- (String) get_page(query, form_values = nil)

Sends the query to the web server using an http get, and returns the response body.

Cookies in the response are preserved in @cookie, so they are sent along with subsequent calls.
Redirects (302) from the PDUs we are querying are currently ignored: the cookie is still recorded, but nil is returned. Any other non-200 response raises a RuntimeError.

Parameters:

  • query (String)

    The part of the URL after host/, usually without parameters if form_values are passed in

  • form_values (Hash{String=>Object-with-to_s}) (defaults to: nil)

    The parameters passed to the web server, e.g. ?key1=value1&key2=value2…

Returns:

  • (String)

    The Net::HTTPResponse body text from the web server, or nil if the response was a 302 redirect



# File 'rlib/webbrowser.rb', line 88

def get_page(query,form_values=nil)
  query += form_values_to_s(form_values, query.index('?') != nil) #Should be using req.set_form_data, but it seems to be stripping the leading / and then the query fails.
  url = @ssl ? URI.parse("https://#{@host}/#{query}") : URI.parse("http://#{@host}/#{query}")
  $stderr.puts url if @debug
  req = Net::HTTP::Get.new(url.request_uri)  #request_uri keeps the ?key=value part; url.path would drop it.
  header = {'User-Agent' => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5', 'Content-Type' => 'application/x-www-form-urlencoded'}
  header['Cookie'] = @cookie if @cookie != nil
  $stderr.puts header['Cookie']  if debug
  req.initialize_http_header( header )

  response = @session.request(req)
  code = response.code.to_i
  raise "#{response.code} #{response.message}" if code != 200 && code != 302

  #Remember the cookies for subsequent calls (reset to '' if the server sent none).
  @cookie = response['set-cookie'] || ''

  return nil if code == 302  #Redirects are ignored. The Location header seems to have the cgi params removed.
  return response.body
end
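
get_page stores the raw Set-Cookie header text and sends it back unchanged as the Cookie header, attributes such as Path included. If a server objects to the attributes, one option is to reduce each Set-Cookie value to its name=value pair first. This is a sketch only: cookie_header_from is a hypothetical helper, and the cookie values are made up.

```ruby
# Hypothetical helper (not in rlib/webbrowser.rb): reduce Set-Cookie header
# values to the "name=value" pairs that a Cookie request header expects.
def cookie_header_from(set_cookie_values)
  pairs = set_cookie_values.map do |value|
    value.split(';').first.to_s.strip  # drop attributes such as Path or HttpOnly
  end
  pairs.reject(&:empty?).join('; ')
end

# Made-up header values, in the shape of one array entry per Set-Cookie line.
puts cookie_header_from(['SID=abc123; Path=/; HttpOnly', 'lang=en; Path=/'])  # SID=abc123; lang=en
```

With a real response, Net::HTTPResponse#get_fields('set-cookie') returns such an array, or nil when the header is absent.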

- (Object) http_session(port = 80) {|| ... }

Creates a session for an http connection.

The attached block would then make get or post calls (Net::HTTP requests) through this instance.

Parameters:

  • port (Fixnum) (defaults to: 80)

    Optional http server port

  • block (Proc)

Yield Parameters:

  • none; the Net::HTTP session is stored in @session rather than passed to the block


# File 'rlib/webbrowser.rb', line 58

def http_session(port = 80)
  @http = Net::HTTP.new(@host, port)   
  @ssl = @http.use_ssl = false       
  @http.start do |session| #ensure we close the session after the block
    @session = session 
    yield
  end
end
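
The open, yield, close pattern used here can be tried without a real device. The sketch below is illustrative only: start_one_shot_server and with_http_session are hypothetical names, and the throwaway local server stands in for the remote web server. Unlike http_session, this variant passes the session to the block.

```ruby
require 'net/http'
require 'socket'

# Hypothetical stand-in for a device's web server: answers one request with "ok".
def start_one_shot_server
  server = TCPServer.new('127.0.0.1', 0)  # port 0: let the OS pick a free port
  thread = Thread.new do
    client = server.accept
    while (line = client.gets) && line != "\r\n"; end  # skip the request headers
    client.write("HTTP/1.1 200 OK\r\nContent-Length: 2\r\nConnection: close\r\n\r\nok")
    client.close
    server.close
  end
  [server.addr[1], thread]
end

# Same shape as http_session: open, hand control to the block, close on return.
def with_http_session(host, port)
  http = Net::HTTP.new(host, port, nil)  # nil: do not pick up a proxy from the environment
  http.use_ssl = false
  http.start do |session|  # start closes the connection when the block returns
    yield session
  end
end

port, server_thread = start_one_shot_server
body = with_http_session('127.0.0.1', port) do |session|
  session.request(Net::HTTP::Get.new('/')).body
end
server_thread.join
puts body  # ok
```

Net::HTTP#start returns the block's value, so the page text can be handed straight back to the caller.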

- (Object) https_session(port = 443) {|| ... }

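Creates a session for an https connection. Certificate verification is turned off (VERIFY_NONE), as embedded devices usually do not have a signed certificate.

The attached block would then make get or post calls (Net::HTTP requests) through this instance.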

Parameters:

  • port (Fixnum) (defaults to: 443)

    Optional https server port

  • block (Proc)

Yield Parameters:

  • none; the Net::HTTP session is stored in @session rather than passed to the block


# File 'rlib/webbrowser.rb', line 72

def https_session(port = 443)
  @http = Net::HTTP.new(@host, port)   
  @ssl = @http.use_ssl = true        #Use https. Doesn't happen automatically!
  @http.verify_mode = OpenSSL::SSL::VERIFY_NONE  #ignore that this is not a signed cert. (as they usually aren't in embedded devices)
  @http.start do |session| #ensure we close the session after the block
    @session = session
    yield
  end
end
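
Where a device does carry a verifiable certificate, the verify mode could be made a choice rather than fixed. This is a minimal sketch under that assumption: build_https and its verify and ca_file keywords are hypothetical and not part of WebBrowser, and the host name is a placeholder. Nothing is connected until start is called.

```ruby
require 'net/http'
require 'openssl'

# Hypothetical helper (not in rlib/webbrowser.rb): configure, but do not open,
# an https connection with certificate checking either on or off.
def build_https(host, port = 443, verify: false, ca_file: nil)
  http = Net::HTTP.new(host, port)
  http.use_ssl = true  # https does not happen automatically
  if verify
    http.verify_mode = OpenSSL::SSL::VERIFY_PEER
    http.ca_file = ca_file if ca_file  # optional CA bundle for a private CA
  else
    http.verify_mode = OpenSSL::SSL::VERIFY_NONE  # what https_session does
  end
  http
end

http = build_https('pdu.example.invalid', 443, verify: true)
puts http.use_ssl?                                     # true
puts http.verify_mode == OpenSSL::SSL::VERIFY_PEER     # true
```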

- (String) post_page(query, form_values = nil)

Sends the query to the web server using an http post, and returns the response body.

Cookies in the response are preserved in @cookie, so they are sent along with subsequent calls.
Redirects (302) from the PDUs we are querying are currently ignored: the cookie is still recorded, but nil is returned. Any other non-200 response raises a RuntimeError.

Parameters:

  • query (String)

    The part of the URL after host/, usually without parameters if form_values are passed in

  • form_values (Hash{String=>Object-with-to_s}) (defaults to: nil)

    The parameters passed to the web server in the body of the post, e.g. key1=value1&key2=value2…

Returns:

  • (String)

    The Net::HTTPResponse body text from the web server, or nil if the response was a 302 redirect



# File 'rlib/webbrowser.rb', line 133

def post_page(query,form_values=nil)
  #The form values go in the request body via set_form_data, not in the URL.
  url = @ssl ? URI.parse("https://#{@host}/#{query}") : URI.parse("http://#{@host}/#{query}")
  $stderr.puts url if @debug
  req = Net::HTTP::Post.new(url.path)
  header = {'User-Agent' => 'Mozilla/5.0 (Macintosh; U; Intel Mac OS X 10.6; en-US; rv:1.9.1.5) Gecko/20091102 Firefox/3.5.5', 'Content-Type' => 'application/x-www-form-urlencoded'}
  header['Cookie'] = @cookie if @cookie != nil
  $stderr.puts header['Cookie']  if debug
  req.initialize_http_header( header )
  req.set_form_data(form_values, '&') if form_values != nil

  response = @session.request(req)
  code = response.code.to_i
  raise "#{response.code} #{response.message}" if code != 200 && code != 302

  #Remember the cookies for subsequent calls (reset to '' if the server sent none).
  @cookie = response['set-cookie'] || ''

  return nil if code == 302  #Redirects are ignored. The Location header seems to have the cgi params removed.

  @response = response
  return response.body
end
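
To see what set_form_data places in the body of the post, a request can be built and inspected without sending it. This is a small sketch: build_post is a hypothetical helper, and '/login.cgi' and the field names are made-up values.

```ruby
require 'net/http'

# Hypothetical helper: build (but do not send) a post request carrying the
# form values, the same way post_page prepares its request.
def build_post(path, form_values)
  req = Net::HTTP::Post.new(path)
  req.set_form_data(form_values, '&')  # escapes the values and sets the Content-Type
  req
end

req = build_post('/login.cgi', { 'user' => 'admin', 'password' => 'a b' })
puts req.body             # user=admin&password=a+b
puts req['Content-Type']  # application/x-www-form-urlencoded
```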