Tuesday, August 31, 2010

Carriage return being stripped when using XslCompiledTransform - simple XSL fix to prevent it

One of the most interesting and annoying parts of rewriting .NET 1.0 code for .NET 3.5 is discovering that simple changes under the hood have the potential to drive you nuts! XML serialization has been my personal cross to bear as I do tasks such as transforming a message from XML to text using an XSLT. Now, you'd think that this would simply be a matter of changing from XslTransform to XslCompiledTransform. Unfortunately it's not that straightforward.


Now there are perfectly good reasons why Microsoft made the changes. They even kindly put up documentation for migration: Migrating From the XslTransform Class. So I rewrote the code, made some minor tweaks, checked the XSLT output against an XML payload. Perfect. And then I ran the code. Instead of getting a text file with an output like this:

Data Blah
Field 1: X
Field 2: Y
Field 3: Z
...

I got
Data Blah  Field 1: X  Field 2: Y  Field 3: Z

You might think, so what, the formatting is a bit screwed up. Sadly, the formatting is used by a secondary processor and the carriage returns are the delimiters - part of which is a stored proc that I don't really want to change.

So why is this happening? Well, from what I can gather, it's all a part of the XmlReader and the way that it handles whitespace. Because this is from a few weeks ago, and is mainly notes to remind myself, I've forgotten most of the technical details (if I find my notes, I'll update this). Basically what ended up happening was that the whitespace characters inserted by the XSL, such as &#xD; or &#xA;, were effectively being translated to a space when being read in. When the transformation then output the result as a text file through the XSLT, the XML had been read in with spaces rather than carriage returns, so the result ended up being one single line.

I played with much code, thought about rejigging the entire processing feed to insert and use a different delimiter character, and finally got a really stupidly simple thing to work. Rather than using just &#xD;, I used a combination. So the XSL text became something like (note syntax may not be quite correct):


&#xD; &#xA;



I'm not sure why the two fields worked when a single one didn't, but hey, changing one line in an XSL rather than writing a custom deserialiser is ok by me!

As a side note: &#xD; is the same as &#13;, which is CR (carriage return)
&#xA; is the same as &#10;, which is LF (line feed)

From an XSL point of view, all of these values work; it's only when the stylesheet goes through the XmlReader used by XslCompiledTransform that the transformations don't always work.
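
For reference, here is a minimal sketch of the kind of setup described in this post: a text-output stylesheet that ends each line with the CR + LF pair, run through XslCompiledTransform. The stylesheet, the element names (Data, Field) and the sample payload are all made up for illustration and are not the real ones from my project.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Xsl;

class Program
{
    static void Main()
    {
        // Hypothetical stylesheet: one output line per <Field>, each terminated with CR + LF.
        const string xslt = @"<xsl:stylesheet version=""1.0"" xmlns:xsl=""http://www.w3.org/1999/XSL/Transform"">
  <xsl:output method=""text""/>
  <xsl:template match=""/Data"">
    <xsl:for-each select=""Field"">
      <xsl:value-of select=""@name""/>
      <xsl:text>: </xsl:text>
      <xsl:value-of select="".""/>
      <xsl:text>&#xD;&#xA;</xsl:text>
    </xsl:for-each>
  </xsl:template>
</xsl:stylesheet>";

        // Hypothetical payload.
        const string xml = "<Data><Field name=\"Field 1\">X</Field><Field name=\"Field 2\">Y</Field></Data>";

        // Load the stylesheet through an XmlReader.
        var transform = new XslCompiledTransform();
        using (var styleReader = XmlReader.Create(new StringReader(xslt)))
        {
            transform.Load(styleReader);
        }

        // Transform to a TextWriter, since the result is plain text rather than XML.
        using (var input = XmlReader.Create(new StringReader(xml)))
        using (var output = new StringWriter())
        {
            transform.Transform(input, null, output);
            Console.Write(output.ToString());
        }
    }
}
```

If the line breaks still go missing, dumping the output as bytes (or opening it in a hex viewer) is a quick way to see whether you are getting CR, LF, both, or a plain space.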

Wednesday, June 2, 2010

Force XML closing tag when using XML Serializer (in VB.NET)

To force an xml closing tag explicitly when using XML serializer (Kind of the vb version of this)

http://bytes.com/topic/net/answers/178893-force-xmlserializer-use-explicit-closing-tags-zero-length-strings


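I wanted to serialize an empty object to an element with an explicit closing tag (the <Tag></Tag> form, where Tag stands for whatever the element is called).

xmlSerialize.serialize produced the self-closing form instead (<Tag />).

Rather than manually fixing up the produced xml, I decided to do what the above article does and override the default WriteEndElement behaviour. Unfortunately the application is in vb. (It turned out I didn't need to do it, but I got it working anyway.)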

So here's a rough conversion:

Imports System.IO
Imports System.Xml
Imports System.Xml.Serialization

Public Class XMLTextWriterEE
    Inherits XmlTextWriter

    Public Sub New(ByVal sink As TextWriter)
        MyBase.New(sink)
    End Sub

    ''' <summary>
    ''' Wrapper that forces explicit closing tags to be written, even for empty elements.
    ''' </summary>
    Public Overrides Sub WriteEndElement()
        MyBase.WriteFullEndElement()
    End Sub
End Class

To use it, you just make the normal Serialize call, but wrap your StringWriter (or other TextWriter) in the new writer class:
serializer.Serialize(New XMLTextWriterEE(destTextWriter), obj)
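
For comparison, here is a small self-contained C# sketch of the same idea, end to end. The class names (FullEndElementWriter, EmptyThing) are made up for the example; only XmlTextWriter, XmlSerializer and WriteFullEndElement are the real framework pieces.

```csharp
using System;
using System.IO;
using System.Xml;
using System.Xml.Serialization;

// Hypothetical type with nothing in it, so it serializes to an empty element.
public class EmptyThing
{
}

// Writer that always emits an explicit closing tag instead of the self-closing form.
public class FullEndElementWriter : XmlTextWriter
{
    public FullEndElementWriter(TextWriter sink) : base(sink)
    {
    }

    public override void WriteEndElement()
    {
        base.WriteFullEndElement();
    }
}

class Program
{
    static void Main()
    {
        var serializer = new XmlSerializer(typeof(EmptyThing));

        using (var destTextWriter = new StringWriter())
        {
            // Same call as usual, but the TextWriter is wrapped in the custom writer.
            serializer.Serialize(new FullEndElementWriter(destTextWriter), new EmptyThing());
            Console.WriteLine(destTextWriter.ToString());
        }
    }
}
```

Swapping FullEndElementWriter for a plain XmlTextWriter in the Serialize call is an easy way to compare the two outputs side by side.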

Sunday, March 28, 2010

General programming posts

Recently read articles that are worth sharing

Top 10 things that annoy programmers  Great list of things that annoy developers - in any language, at any time, and on any project! BTW, the rest of his blog is well worth a read!

Top 25 most dangerous programming mistakes - covering a little bit of everything regarding security, including input validation, XSS and SQL injection. If you aren't reading this blog and you're a programmer, you should be :)

An Address on the Craft (Of Software Development) A little bit of nostalgia, some comments on reality, and a little bit of poetry.

Wednesday, February 17, 2010

Looping through XML

Code snippet for transforming an XmlDocument to an XDocument (I got this from another blog, though I've lost the link - google and you shall probably find). I didn't end up using it as I decided to work with XmlNodeLists. It's a personal project, so performance is not an issue (if it's taking too long, I'll fix it).

private static XDocument DocumentToXDocumentReader(XmlDocument doc)
{
    return XDocument.Load(new XmlNodeReader(doc));
}

Looping through XML nodes ... one solution. Get the repeating nodes and put them in a node list, then loop through the node list with a foreach loop.

XmlDocument doc = new XmlDocument();
WebRequest request = HttpWebRequest.Create(url);
WebResponse response = request.GetResponse();
doc.Load(response.GetResponseStream());
XmlNodeList nodelist = doc.GetElementsByTagName("Item");

foreach (XmlNode item in nodelist)
{
    // Do stuff like string temp = item.ChildNodes.Item(3).InnerText;
    // In my case I'm passing the data from the XML payload into a list of objects
}

Sunday, February 14, 2010

My machine hates me - debugging hell

So I've been working on some code at work. In many ways it's fairly straightforward: porting unmanaged code into a managed environment. The whole thing is pseudo-synchronous, with the service setting up a call and dumping it in a queue, a listener picking it up and doing stuff, then putting it back, while the calling service checks for the result and uses it.

All was going well.

The service to set up the command was working. Check.
The message was being picked up and processed correctly. Check.
The response was being generated correctly. Check.
Hook it all up together ... and the response wasn't coming back.

To say I was quite confused is putting it mildly. I had effectively duplicated an existing process ... that was working perfectly (well, it was after I fixed some minor configuration and environment issues). So what the heck was going on?

I debugged about 15 different ways. Ran it normally and wrote in some logging outputs. I attached to the service process. I attached to the async process. I attached to multiple processes. I injected data. I used our test stub. I hacked the data in the database. I ran unit tests. I ran a console runner.

In final desperation I handed the code over to another dev and asked them to take a look: 1 shelveset and some database scripts to run against the base code. An hour or so later they came back (actually it may have been earlier, but I'd been called to a meeting so was gone for an hour). Apparently it worked on their machine. No changes. No problems. It magically worked.

Can I scream now?

So the long and the short of it? I have a very good document written with detailed instructions on setting up and debugging this code. I pretty much understand the code on a level I certainly didn't expect to have to know. I have no clue as to why my code is not running on my machine, yet will happily run on someone else's (although I swear it hates me, unlike my other old faithful which I tearfully left to a tester because she couldn't handle the RAM upgrade necessary to run my shiny pretties).

Which leads me to my conclusion. Timebox the problem (I probably spent far too much time on it, even if I did solve some other bugs in the process). Grab a partner to give you some fresh perspective. Give someone else the code to look at on a different machine. If you're like me, your dev environment may be the culprit. (That theory is being tested on Monday after I get the code running again on my machine. We are so not trusting the 'it worked on this machine, so that machine can go to prod' theory. I want answers damn it. If my machine is evil, so be it. She gets reimaged again - 2nd time this year - and I pull down the code and the sql scripts and try it again).

Some random notes around debugging a frustrating problem

  • Check your permissions on everything. You should be running under the lowest security necessary, but different OSes may differ slightly in their permission sets.
  • Check that the project has been set to run under the OS correctly. If you are on a 64bit machine and you are dealing with a 32bit dll/ COM interop, then the individual project will probably need to run as X86 rather than 'Any CPU'. (Main solution can be Any CPU, but project that hosts the 32 bit dll can't be)
  • Use unit tests to verify that each part is working in isolation
  • Run through the code without the debugger. Sometimes things work differently under debug than when they run 'normally' (e.g. timeout issues that only show up in one or the other)
  • Information is power. As a vb6 dev, easiest way to debug something complex was often to log to a file chunks of data. It's ugly, but it works. The more you know about the state of the data, the easier it is to pinpoint where in the code the problem is
  • Get someone else to help. Your eyes start to gloss over the details you need to be looking at.
Edit://
Turns out that I was a bit of a nong and had screwed up/missed some dependency injection with Unity. So all working now, but boy, what a mess to track down! Lucky one of the guys at work was more cluey than me and got it sorted!

    Saturday, January 30, 2010

    Linq 2 SQL - where statement querying multiple columns with the same value

    I have a search function that I need to be able to search for a piece of text against two columns in the database. In this case, I'm searching for the title of a book, but I also want to be able to search against the subtitle column with the same search value.

    For example, the book title may be "The Smith's Christmas", with a subtitle of "First in the Xmas dinners series". I know that 'dinner' is somewhere in the name of the book. I input the term 'dinner', and need it to search against both the Title and the Subtitle (which are two separate fields).

    This is some of my code.

    // Because I have multiple search fields, I create 'matches' as the search criteria that changes depending on what I input. Allows it to be a bit dynamic
    IQueryable<Book> matches = LibraryModelHelpers.dc.Books;

    // This is the lambda statement. Basically translates to get me a match where the title contains my input text OR the subtitle contains my input text
    matches = matches.Where(c => (c.Title.Contains(myInputText) || c.Subtitle.Contains(myInputText)));

    // Using the search statement, create an anonymous type which has a title, the author, the bookID and the cover information - I don't care about the rest for this operation
    var coverDetails = from c in matches
                       select new { c.Title, c.Author, c.BookID, c.Cover };
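
To show what I mean by 'a bit dynamic', here is a rough, runnable sketch of the same pattern. It assumes a cut-down Book class and an in-memory list in place of the real LINQ to SQL data context, and the search variables (titleText, authorText) and the author names are made up for the example.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the LINQ to SQL Book entity.
class Book
{
    public string Title { get; set; }
    public string Subtitle { get; set; }
    public string Author { get; set; }
}

class Program
{
    static void Main()
    {
        // In-memory data instead of a DataContext, so the sketch runs without a database.
        IQueryable<Book> matches = new List<Book>
        {
            new Book { Title = "The Smith's Christmas", Subtitle = "First in the Xmas dinners series", Author = "A. Writer" },
            new Book { Title = "Another Book", Subtitle = "", Author = "B. Writer" }
        }.AsQueryable();

        // Hypothetical search inputs; null or empty means that search box was left blank.
        string titleText = "dinner";
        string authorText = null;

        // Only add a filter when the matching search field was filled in.
        if (!string.IsNullOrEmpty(titleText))
        {
            matches = matches.Where(b => b.Title.Contains(titleText) || b.Subtitle.Contains(titleText));
        }
        if (!string.IsNullOrEmpty(authorText))
        {
            matches = matches.Where(b => b.Author.Contains(authorText));
        }

        foreach (var b in matches)
        {
            Console.WriteLine(b.Title);   // prints "The Smith's Christmas"
        }
    }
}
```

One difference to watch for: against in-memory objects a null Subtitle would throw on Contains, whereas in LINQ to SQL the comparison is translated to SQL and runs on the database side.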

    Tuesday, January 26, 2010

    XML node doesn't exist?

    This is from a small personal project using Amazon webservices. It's a pretty simple windows form desktop app that uses Linq to SQL. 

    Dealing with some evil XML from Amazon that seems to have several variations. In some responses the node is there, other responses the node doesn't exist and is called something else (for pretty much the same data). I'm being lazy and pulling out the data I need one node at a time which can be a problem when the nodes are not present.

    To get around this, for the nodes that I know to be problematic, I needed to check that the node was actually going to be available before doing the convert. I do this by looking at the exact node and check if it's null.

    doc.GetElementsByTagName("TotalReviews").Item(0) != null

    If you just check doc.GetElementsByTagName("TotalReviews") != null, it won't help: the call returns an empty node list (a count of 0, which I think is what I saw when I looked at it) rather than null, and the code then falls over because there are no items to pull data from. You need to check Item(0), as this is the value you are doing .InnerText on.

    I'm using GetElementsByTagName as it's generally easier than a longwinded xpath, but in some cases this is just not appropriate. In a few cases I've selected child nodes by their position. I know this code isn't pretty, and it probably isn't the best way to do things, but it works.

    using System;
    using System.Net;
    using System.Xml;

    XmlDocument doc = new XmlDocument();
    WebRequest request = HttpWebRequest.Create(url);
    WebResponse response = request.GetResponse();
    doc.Load(response.GetResponseStream());

    // If in doubt, spit the xml out somewhere and look at the payload
    // doc.Save(Console.Out);
    ...

    // Use whichever of the two node names is present; default to 0 if neither is
    if (doc.GetElementsByTagName("TotalReviews").Item(0) != null)
    {
        book.NumOfReviews = Convert.ToInt32(doc.GetElementsByTagName("TotalReviews").Item(0).InnerText);
    }
    else if (doc.GetElementsByTagName("TotalFeedback").Item(0) != null)
    {
        book.NumOfReviews = Convert.ToInt32(doc.GetElementsByTagName("TotalFeedback").Item(0).InnerText);
    }
    else
    {
        book.NumOfReviews = 0;
    }

    ...
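
If there end up being more of these alternate node names, the if/else chain could be folded into a small helper. This is only a sketch: ReadFirstInt is a name I've made up, and the inline XML is a stand-in for the real Amazon payload.

```csharp
using System;
using System.Xml;

class Program
{
    // Returns the integer text of the first element found under any of the given tag names, or 0.
    static int ReadFirstInt(XmlDocument doc, params string[] tagNames)
    {
        foreach (string name in tagNames)
        {
            // Item(0) is null when no element with this name exists in the document.
            XmlNode node = doc.GetElementsByTagName(name).Item(0);
            int value;
            if (node != null && int.TryParse(node.InnerText, out value))
            {
                return value;
            }
        }
        return 0;
    }

    static void Main()
    {
        // Hypothetical payload that only has the alternate node name.
        var doc = new XmlDocument();
        doc.LoadXml("<Item><TotalFeedback>12</TotalFeedback></Item>");

        Console.WriteLine(ReadFirstInt(doc, "TotalReviews", "TotalFeedback"));   // prints 12
    }
}
```

Note that this uses int.TryParse rather than Convert.ToInt32, so a node with non-numeric text is skipped instead of throwing, which is a slightly different behaviour from the code above.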