How To Handle The apachesoap:DataHandler Type Using gSOAP

© Catalin Negrila, Friday August 18, 2006

One of the notorious problems of connecting to WebServices implemented in Java from a client written in anything other than Java is support for the apachesoap:DataHandler type. If you've ever tried to upload a file from a C/C++ client to an Axis-based WebService, you probably know what I'm talking about.

Recently I had to do just that. I am pretty new to the world of WebServices, but googling around and browsing the appropriate forums turned up no solution to my problem, only more people wondering how to solve this exact issue. So after hacking my way through it, I thought a tutorial on how to do this might be helpful to other newbies like me.

Note that this solution is only relevant if you're using the gSOAP toolkit to generate a client application. If you're using something else, or you want to fake it on the server side, it might still inspire a solution that applies to your situation.

I used gSOAP 2.7 to parse the WSDL files of the WebService I was connecting to and generate the skeleton code for my client. I've only used gSOAP for less than a week but I think it's a great tool. It sure helped me a lot in taking care of all the nasty and arcane (to me) details of SOAP, WSDL and probably a host of others that I cannot even name.

gSOAP is extremely easy to use and very well documented. However, for people who have never tried it before, I'll summarize how I used it to make this article easier to follow. Basically, you run the WSDL file of the desired WebService through the first gSOAP parser, wsdl2h, which generates an intermediate header file. You then run a second gSOAP parser, soapcpp2, on that intermediate header file, and if all goes well you end up with a few megabytes of automatically generated C or C++ code. You get a client side skeleton that makes calling a WebService and obtaining the results as easy as calling a C function. The same parser can also generate a server side skeleton for that WebService, but this is not relevant to the current discussion.

So let's get started:

Step 1: Download and install gSOAP.

Step 2: Download the WSDL of your favorite service using apachesoap:DataHandler to a folder on your local hard drive.

Step 3: Copy typemap.dat from the gSOAP folder into the same folder as the WSDL. You can spread these files around and specify their locations through command line parameters, but for simplicity's sake we'll just keep them together. You can also point the wsdl2h tool directly at the WSDL's URL from the command line, but I found it useful to have a local copy handy and peek inside when I encountered issues during the parsing.

Add the apachesoap namespace to your copy of typemap.dat like this:

apachesoap="http://xml.apache.org/xml-soap/"

Once you start to actually do something useful with this service you'll probably add more namespaces there as well, but if you're already familiar with gSOAP you have probably done that already. I cannot encourage you strongly enough to read the gSOAP User Guide for more info on how to use it; that is outside the scope of this article.

Step 4: Now you should be ready to parse the WSDL file using gSOAP's wsdl2h and obtain the intermediate header file.

wsdl2h -o MyHeaderFile.h MyWebService.wsdl

You will probably get a warning at this point, something like this:

Warning: no part 'uploadFile' type '"http://xml.apache.org/xml-soap/":DataHandler'
in WSDL definitions '' namespace.

This is not critical at this point though, so the intermediate header file still gets generated correctly.

Step 5 - The Problem: When trying to parse the intermediate header file using the second gSOAP parser, soapcpp2, the thing hits the proverbial fan.

soapcpp2 -I%GSOAP_DIR%/import MyHeaderFile.h

You will get an error like this:

MyHeaderFile.h(3703): parse error

Not too helpful at first. If the WSDL has other issues, like mine did, you'll probably see several of these errors popping up for all kinds of different reasons.

If you open MyHeaderFile.h and go to that line, it will look something like this:

// Warning: internal error, undefined qname
//	'"http://xml.apache.org/xml-soap/":DataHandler' for type
//	'apachesoap__DataHandler'
apachesoap__DataHandler _fileContent, /// <<Request parameter

This didn't help me much either, but it suggests that if you could somehow define this apachesoap__DataHandler type you might get away with it. Depending on how many such errors soapcpp2 encounters during parsing, the C/C++ skeleton files might or might not get generated. Even if they do, the bindings that use this apachesoap__DataHandler will be unusable.

This whole issue seems to be caused by the fact that DataHandler is a Java specific type and gSOAP doesn't have any C/C++ equivalent for it. However if you look at the SOAP contents of messages containing apachesoap:DataHandler you'll see that there's not much to it actually. The element whose type is DataHandler is represented by only one simple XML tag with some attributes that point to the MIME attachment where the data is actually stored.

Now, I'm pretty new to gSOAP so there might be more elegant or better ways to solve this. I just couldn't find any and this seemed to work fine for me but if you find any better solutions please let me know.

Step 6: Create a file called apachesoap.h. You will need to have this in soapcpp2's import path so either put it in the current folder, or in gSOAP's import folder, or just point to it with the -I command line parameter. You can use ";" to specify multiple paths in there. This file need only contain one type definition:

typedef struct apachesoap__DataHandler {
	@std::string href;
} apachesoap__DataHandler;

You can use a simple char* instead of the STL std::string if you're C only and it will work just the same. Just be careful to free it when you're done. The name of this data type should match the namespace declaration you included in your typemap.dat for http://xml.apache.org/xml-soap/. If you used a prefix other than apachesoap, you'll also need to change the namespace prefix (the word before the double underscores) in the name of the structure to whatever you called it.

This structure defines a simple type that is represented by one XML tag with an attribute. If you spy on the SOAP messages going back and forth, it looks something like this:

<filecontent href="cid:3289749875439872658376583476390"/>

The @ character before href's definition means that it is an attribute and not a child of the XML node. The href attribute seems to encode the Content ID of the MIME attachment that stores the data that you want to upload.
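
To make the relationship between the Content ID and the href concrete, here is a minimal sketch of the two string forms that the upload code later in this article relies on: the ID wrapped in angle brackets for the MIME attachment, and the same ID prefixed with "cid:" for the href attribute. The helper names mimeContentId and cidHref are my own for illustration; they are not part of gSOAP.

```cpp
#include <cassert>
#include <string>

// Hypothetical helpers, not part of gSOAP.

// Form used for the MIME attachment's ID: the bare ID in angle brackets.
std::string mimeContentId(const std::string& id) {
    assert(!id.empty());  // an empty ID cannot identify an attachment
    return "<" + id + ">";
}

// Form used for the href attribute of the DataHandler element.
std::string cidHref(const std::string& id) {
    assert(!id.empty());
    return "cid:" + id;
}
```

Both forms must be built from the same bare ID, otherwise the server has no way to match the element to its attachment.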

Step 7: In order for the soapcpp2 parser to actually use this new type you need to import it into the intermediate header file. So, after running wsdl2h, just add the following line to the import section of MyHeaderFile.h, at the top of the file. Remember to make this change each time you run wsdl2h since the header file is completely overwritten each time.

#import "apachesoap.h"

If you run the soapcpp2 tool again on the intermediate header file at this point you should get no more DataHandler related errors. Other WSDL issues might cause similar looking errors. I solved mine with simple search and replace, and by removing some type that wasn't used anywhere but was generating a parse error nonetheless.
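
Since the import line has to be re-added after every wsdl2h run, it may be worth automating. The following is only a sketch of one way to do that in plain C++, assuming it is acceptable to put the line at the very top of the header; ensureImport and patchHeaderFile are hypothetical names of mine, not gSOAP tools.

```cpp
#include <cassert>
#include <fstream>
#include <sstream>
#include <string>

// Hypothetical helper: returns the header text with the import line
// prepended, unless that exact line already appears somewhere in the text.
std::string ensureImport(const std::string& headerText,
                         const std::string& importLine = "#import \"apachesoap.h\"") {
    assert(!importLine.empty());
    if (headerText.find(importLine) != std::string::npos) {
        return headerText;  // already patched, leave it untouched
    }
    return importLine + "\n" + headerText;
}

// Hypothetical helper: applies ensureImport to a file in place.
// Returns false if the file could not be read or written.
bool patchHeaderFile(const std::string& path) {
    std::ifstream in(path);
    if (!in) return false;
    std::stringstream buffer;
    buffer << in.rdbuf();
    in.close();

    const std::string patched = ensureImport(buffer.str());

    std::ofstream out(path, std::ios::trunc);
    if (!out) return false;
    out << patched;
    return static_cast<bool>(out);
}
```

Calling patchHeaderFile("MyHeaderFile.h") between the wsdl2h and soapcpp2 steps would then replace the manual edit. Running it twice is harmless because the line is only added when missing.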

Step 8: At this point you'll have a bunch of automatically generated C/C++ source and header files. I won't go into detail about how to compile those in your project; that's pretty well documented in the gSOAP User's Guide and chances are you're already familiar with it.

Most of the WebService calls using this gSOAP generated code will be a one line affair, and a short one at that, if you use the generated proxy classes. The verbose version looks like this:

ns1__loginResponse respLogin;
if (soap_call_ns1__login(&soap, NULL, NULL, "user", "passwd", respLogin) != SOAP_OK) {
	// do some fancy error reporting, e.g. soap_print_fault(&soap, stderr);
}

The uploadFile call, the one using the DataHandler, is a little bit more involved though. The gSOAP generated code takes care of serializing the apachesoap:DataHandler data type that we have defined. All we are left to do is to set up our data as a MIME attachment before the call and to fill in the href attribute of apachesoap__DataHandler with the correct Content ID. You can find everything about that in gSOAP's User Guide in the MIME Attachments section, but here's how I did it (I omitted error checking for brevity):

char *jpegImage = ...memory buffer containing a jpeg...;
int jpegLen = ...size of the jpeg buffer...;
std::string strContentID = "183648746758934743"; // some unique string
// Keep the bracketed ID in a named variable so the pointer handed to gSOAP
// stays valid through the upload call (no assumption that gSOAP copies it).
std::string strMimeID = "<" + strContentID + ">";

soap_set_mime(&soap, NULL, NULL);
soap_set_mime_attachment(&soap, jpegImage, jpegLen, SOAP_MIME_BINARY,
			"image/jpeg", strMimeID.c_str(), NULL, NULL);

ns1__uploadResponse respUpload;
apachesoap__DataHandler fileContent;
fileContent.href = "cid:" + strContentID;
soap_call_ns1__uploadFile(&soap, NULL, NULL, "/some/path/on/the/server/",
			fileContent, respUpload);

soap_clr_mime(&soap);
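
The upload code above only says the Content ID should be "some unique string". If you would rather not hard-code one, here is a small sketch of one possible generator, combining a timestamp with a per-process counter. makeContentId is a hypothetical name of mine, and the "-" separator is my own choice; whether a given server accepts anything other than digits in the ID is an assumption you should verify.

```cpp
#include <atomic>
#include <cassert>
#include <chrono>
#include <string>

// Hypothetical helper, not part of gSOAP: returns an ID that is unique within
// this process. The counter guarantees uniqueness between calls; the
// timestamp makes collisions between separate runs unlikely.
std::string makeContentId() {
    static std::atomic<unsigned long> counter{0};
    const auto sinceEpoch = std::chrono::system_clock::now().time_since_epoch();
    const auto micros =
        std::chrono::duration_cast<std::chrono::microseconds>(sinceEpoch).count();
    std::string id = std::to_string(micros) + "-" + std::to_string(counter.fetch_add(1));
    assert(!id.empty());
    return id;
}
```

The result can be used in place of the literal assigned to strContentID.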

That's all there is to it. :)

Handling downloads, i.e. receiving an apachesoap:DataHandler in a response, is left as an exercise for the reader.
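
As a starting point for that exercise, here is a rough sketch of the matching step only: given the href from a received DataHandler element, find the attachment whose Content ID it refers to. I have not run this against a real service. The Attachment struct is a hypothetical stand-in for whatever structure gSOAP exposes for received MIME parts, and I am assuming the ID may arrive either bare or wrapped in angle brackets.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for a received MIME part; gSOAP has its own type.
struct Attachment {
    std::string contentId;  // as received, e.g. "<12345>" or "12345"
    std::string data;       // raw attachment bytes
};

// Strips a leading "cid:" from href; returns href unchanged if it has none.
std::string idFromHref(const std::string& href) {
    const std::string prefix = "cid:";
    if (href.compare(0, prefix.size(), prefix) == 0) {
        return href.substr(prefix.size());
    }
    return href;
}

// Returns the attachment referenced by href, or nullptr if none matches.
const Attachment* findAttachment(const std::vector<Attachment>& parts,
                                 const std::string& href) {
    assert(!href.empty());  // caller passes the href taken from the response
    const std::string id = idFromHref(href);
    for (const Attachment& part : parts) {
        if (part.contentId == id || part.contentId == "<" + id + ">") {
            return &part;
        }
    }
    return nullptr;
}
```

In a real client you would fill the list from the attachments gSOAP received with the response; see the MIME Attachments section of the gSOAP User Guide for how to access them.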