There seems to be growing interest in Python in the R community. While opinions differ about using R over Python (or vice versa) for exploratory data analysis, fitting statistical/machine learning algorithms and so on, I consider one of the strongest attractions of Python to be the fact that it is a general-purpose programming language. Because more developers are involved in it, it can provide an easy way to get jobs done that can be tricky in R. This article introduces one such example by illustrating how to connect to SOAP (Simple Object Access Protocol) web services.

A web service (or API) is a popular way to connect to a server programmatically, and a SOAP web service is one type. For those who are interested in it, please see this article. Although R has good packages for connecting to a newer type of web service based on REST (Representational State Transfer), e.g. the httr package, I haven’t found a good R package that can be used as a comprehensive SOAP client, which means the best I can do is use the RCurl package. Python, on the other hand, has a number of SOAP client libraries, in keeping with its ‘batteries included’ philosophy. Among those, I’ve chosen the suds library.

In this demo, I’m going to connect to the Sizmek MDX API, from which online campaign data can be pulled. I’ve used the PyDev plugin for Eclipse, and the source of this demo can be found in my GitHub repo. It has 4 classes that connect to the API (Authentication, Advertiser, ConvTag and Campaign), and they are kept in the sizmek package. Two extra classes (Soap and Helper) are set up in the utils package, and they keep the methods common to the 4 classes. The Advertiser class is shown below.

from utils.soap import Soap
from datetime import datetime


class Advertiser:
    def __init__(self, pid, name, vertical, useConv):
        '''
        Constructor
        '''
        self.id = pid
        self.name = name
        self.vertical = vertical
        self.useConv = useConv
        self.addedDate = datetime.now().date()

    def __repr__(self):
        return "id: %s|name: %s|use conv: %s" % (self.id, self.name, self.useConv)

    def __str__(self):
        return "id: %s|name: %s|use conv: %s" % (self.id, self.name, self.useConv)

    def __len__(self):
        return 1

    @staticmethod
    def GetItemRes(wurl, auth, pageIndex, pageSize, showExtInfo=True):
        # request a single page of advertisers and return the raw response
        client = Soap.SetupClient(wurl, auth, toAddToken=True, toImportMsgSrc=True, toImportArrSrc=False)
        # update paging info
        paging = client.factory.create('ns1:ListPaging')
        paging['PageIndex'] = pageIndex
        paging['PageSize'] = pageSize
        # update filter array - empty
        filterArray = client.factory.create('ns0:ArrayOfAdvertiserServiceFilter')
        # get response
        response = client.service.GetAdvertisers(filterArray, paging, showExtInfo)
        return response

    @staticmethod
    def GetItem(response):
        # convert a raw response into a list of Advertiser objects
        objList = []
        for r in response[1]['Advertisers']['AdvertiserInfo']:
            obj = Advertiser(r['ID'], r['AdvertiserName'], r['Vertical'], r['AdvertiserExtendedInfo']['UsesConversionTags'])
            objList.append(obj)
        return objList

    @staticmethod
    def GetItemPgn(wurl, auth, pageIndex, pageSize, showExtInfo=True):
        # keep requesting pages until TotalCount items have been collected
        objList = []
        cond = True
        while cond:
            response = Advertiser.GetItemRes(wurl, auth, pageIndex, pageSize, showExtInfo)
            objList = objList + Advertiser.GetItem(response)
            Soap.ShowProgress(response[1]['TotalCount'], len(objList), pageIndex, pageSize)
            if len(objList) < response[1]['TotalCount']:
                pageIndex += 1
            else:
                cond = False
        return objList

    @staticmethod
    def GetFilter(objList):
        # keep only the advertisers that use conversion tags
        filteredList = [obj for obj in objList if obj.useConv == True]
        # parenthesised form works as both a Python 2 statement and a Python 3 call
        print("%s out of %s advertisers where useConv equals True" % (len(filteredList), len(objList)))
        return filteredList
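
The paging loop in GetItemPgn() can be tried without access to the API. The following is a minimal sketch in plain Python, not code from the repo: fetch_page() is a hypothetical stand-in for GetItemRes() that slices a local list, and the advertiser names are made up. The extra check for an empty page is my own addition, so that a misreported TotalCount cannot keep the loop running forever.

```python
def fetch_page(items, page_index, page_size):
    """Hypothetical stand-in for GetItemRes(): one page plus the total count."""
    start = page_index * page_size
    return {"TotalCount": len(items), "Items": items[start:start + page_size]}


def fetch_all(items, page_size, page_index=0):
    """Same loop shape as GetItemPgn(): request pages until all items are collected."""
    collected = []
    while True:
        response = fetch_page(items, page_index, page_size)
        collected.extend(response["Items"])
        # Stop when everything has arrived, or when a page comes back empty.
        if len(collected) >= response["TotalCount"] or not response["Items"]:
            break
        page_index += 1
    return collected


advertisers = ["adv-%d" % i for i in range(7)]  # made-up data, not API output
all_items = fetch_all(advertisers, page_size=3)
print(len(all_items))
```

With 7 items and a page size of 3, the loop requests three pages and then stops.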

The 4 classes share a number of common methods (GetItemRes(), GetItem(), GetItemPgn(), GetFilter()) that retrieve data from the relevant sections of the API. As these methods are not tied to an instance of a class, they are declared static (@staticmethod). In R, this class may be constructed as follows.

advertiser <- function(pid, name, vertical, useConv) {
  out <- list()
  out$id <- pid
  out$name <- name
  out$vertical <- vertical
  out$useConv <- useConv
  out$addedDate <- Sys.Date()
  class(out) <- append(class(out), 'Advertiser')
  # return the object itself; without this line the function returns the class vector
  out
}

# S3 generic that dispatches on the class of obj
GetItemRes <- function(obj) {
  UseMethod('GetItemRes', obj)
}

GetItemRes.default <- function(obj) {
  warning('Default GetItemRes method called on unrecognized object.')
  obj
}

GetItemRes.Advertiser <- function(obj) {
  response <- 'Get response from the API'
  response
}

...

While it is relatively straightforward to set up corresponding S3 classes, the issue is that there is no comprehensive SOAP client in R. In Python, the client library helps create a proxy class based on the relevant WSDL file so that a request/response can be handled entirely in a ‘Pythonic’ way. For example, the following shows how to retrieve advertiser details from the API.

from utils.helper import Helper
from sizmek.authentication import Auth
from sizmek.advertiser import Advertiser
import logging

logging.basicConfig(level=logging.INFO)
logging.getLogger('suds.client').setLevel(logging.DEBUG)

## WSDL urls
authWSDL = 'https://platform.mediamind.com/Eyeblaster.MediaMind.API/V2/AuthenticationService.svc?wsdl'
advertiserWSDL = 'https://platform.mediamind.com/Eyeblaster.MediaMind.API/V2/AdvertiserService.svc?wsdl'
campaignWSDL = 'https://platform.mediamind.com/Eyeblaster.MediaMind.API/V2/CampaignService.svc?wsdl'

## credentials
username = 'user-name'
password = 'password'
appkey = 'application-key'

## path to export API responses
path = 'C:\\projects\\workspace\\sizmek_report\\src\\csvs\\'

## authentication
auth = Auth(username, password, appkey, authWSDL)

## get advertisers
advRes = Advertiser.GetItemRes(advertiserWSDL, auth, pageIndex=0, pageSize=50, showExtInfo=True)
advList = Advertiser.GetItem(advRes)
Helper.PrintObjects(advList)

#response example
#id: 78640|name: Roses Only|use conv: True
#id: 79716|name: Knowledge Source|use conv: True
#id: 83457|name: Gold Buyers|use conv: True
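
To give a feel for what a proxy class means here, the toy below turns attribute access into a named operation, so that a call such as service.GetAdvertisers(...) reads like an ordinary method call. This is only an illustration of the idea and makes no claim about how suds is implemented internally: ToyService, its known_operations argument and the returned dictionary are all hypothetical, and nothing is sent over the network.

```python
class ToyService(object):
    """Toy stand-in for a SOAP proxy: each known operation becomes a callable."""

    def __init__(self, known_operations):
        # A real client would read the operation names from the WSDL file.
        self._known = set(known_operations)

    def __getattr__(self, name):
        # Only reached when normal attribute lookup fails, i.e. for operation names.
        if name.startswith("_") or name not in self._known:
            raise AttributeError("unknown operation: %s" % name)

        def call(**params):
            # A real client would serialise this to XML and post it over HTTP;
            # here the request is simply returned for inspection.
            return {"operation": name, "params": params}

        return call


service = ToyService(["GetAdvertisers"])
request = service.GetAdvertisers(PageIndex=0, PageSize=50)
print(request["operation"])
```

An operation that is not in the list raises AttributeError, which is roughly the kind of early feedback a WSDL-driven client can give before any request is made.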

On the other hand, if I use the RCurl package, I have to construct the following SOAP message by hand from the relevant WSDL file, and the XML response has to be parsed accordingly. Although it is possible, life will be a lot harder.

<?xml version="1.0" encoding="UTF-8"?>
<SOAP-ENV:Envelope xmlns:ns0="http://api.eyeblaster.com/message" xmlns:ns1="http://api.eyeblaster.com/message" xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance" xmlns:SOAP-ENV="http://schemas.xmlsoap.org/soap/envelope/">
   <SOAP-ENV:Header>
      <ns1:UserSecurityToken>token-generated-from-authentication-service</ns1:UserSecurityToken>
   </SOAP-ENV:Header>
   <ns1:Body xmlns:ns1="http://schemas.xmlsoap.org/soap/envelope/">
      <GetAdvertisersRequest xmlns="http://api.eyeblaster.com/message">
         <Paging>
            <PageIndex>0</PageIndex>
            <PageSize>50</PageSize>
         </Paging>
         <ShowAdvertiserExtendedInfo>true</ShowAdvertiserExtendedInfo>
      </GetAdvertisersRequest>
   </ns1:Body>
</SOAP-ENV:Envelope>
DEBUG:suds.client:headers = {'SOAPAction': '"http://api.eyeblaster.com/IAdvertiserService/GetAdvertisers"', 'Content-Type': 'text/xml; charset=utf-8'}
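
To show roughly how much manual work that involves, here is a sketch that builds the same request by hand using only Python's standard library. The element names and namespaces are copied from the message above; the function name build_get_advertisers() and the token value are placeholders of my own. It stops short of sending anything: posting the envelope with the SOAPAction header shown in the debug line, and parsing the real response, would still be left to do.

```python
import xml.etree.ElementTree as ET

SOAP_NS = "http://schemas.xmlsoap.org/soap/envelope/"
MSG_NS = "http://api.eyeblaster.com/message"


def build_get_advertisers(token, page_index, page_size):
    """Build a GetAdvertisersRequest envelope shaped like the message above."""
    envelope = ET.Element("{%s}Envelope" % SOAP_NS)
    header = ET.SubElement(envelope, "{%s}Header" % SOAP_NS)
    ET.SubElement(header, "{%s}UserSecurityToken" % MSG_NS).text = token
    body = ET.SubElement(envelope, "{%s}Body" % SOAP_NS)
    request = ET.SubElement(body, "{%s}GetAdvertisersRequest" % MSG_NS)
    paging = ET.SubElement(request, "{%s}Paging" % MSG_NS)
    ET.SubElement(paging, "{%s}PageIndex" % MSG_NS).text = str(page_index)
    ET.SubElement(paging, "{%s}PageSize" % MSG_NS).text = str(page_size)
    ET.SubElement(request, "{%s}ShowAdvertiserExtendedInfo" % MSG_NS).text = "true"
    return ET.tostring(envelope)


xml_bytes = build_get_advertisers("placeholder-token", 0, 50)

# Reading a value back needs the namespace spelled out on every lookup.
root = ET.fromstring(xml_bytes)
page_size = root.find(".//{%s}PageSize" % MSG_NS).text
print(page_size)
```

Every element has to be created and namespaced explicitly, and the same care is needed again when picking values out of a response, which is the bookkeeping a SOAP client library takes off your hands.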

I guess most R users are not programmers, but many of them are quite good at understanding how a program works. Therefore, if there is an area where R is not strong, it’d be alright to consider another language to make life easier. Among those, I consider Python easy to learn, and it provides a range of good tools. If you’re interested, please see my next article about some thoughts on Python.