Welsh Learner Resources

21. January 2013 19:33 by Chris in Welsh  //  Tags:   //   Comments (0)

Following up my translation resources posts, here are som other resources for Welsh learners (work in progress).

 

S4C

Hwb

BBC - try the 'BBC vocab' to help read some Welsh

http://www.bbc.co.uk/cymru/dysgu/ - http://www.bbc.co.uk/cymru/dysgu/dysgucymraeg/ 

Catchphrase

Pigion http://www.bbc.co.uk/radiocymru/safle/pigion/ - seem to have been budget cuts as doesn't offer the supporting information that it used to

Grammar including all lessons as a PDF

SSIW http://www.saysomethinginwelsh.com

Also their wiki, http://ssiw.pbworks.com, e.g. http://ssiw.pbworks.com/w/page/36027266/Fiction%20and%20Poetry 

 

Clic Clonc http://www.telesgop.co.uk/clicclonc/

Facebook

http://www.facebook.com/#!/learnwelshlanguage

http://www.facebook.com/#!/LearnWelsh 

 http://www.unilang.org/course.php?res=84

 

Old Articles

9. January 2013 16:53 by Chris in dev  //  Tags:   //   Comments (0)

I've started adding my old technical articles to this blog ascribed the dates they were originally published but I'll list the articles here as well though some apparently weren't permalinks so may have to remember/ dig out the originals of these. Somewhat surprisingly given many articles date back to 2003 the vast majority are still relevant to varying degrees. Italics and/ or italicised comments indicate those which have suffered with the passage of time, e.g. mobile development has moved on apace and exams have a habit of being deprecated.

If you do link through there seems to be no rhyme nor reason as to what ratign an article gets, as far as I can see anyway!

 

Welsh Online Dictionary Resources

7. January 2013 17:22 by Chris in Welsh  //  Tags:   //   Comments (1)

I'm learning Welsh - Dysgwr Cymraeg ydw i! It does seem a bit of a slog but, stating the obvious, learning a language is difficult. In my 6th year now of evening classes I recently attended a 2 day intensive new year revision course at Cardiff University motivated by the new year and accompanying resolutions. I even made it to my first Clonc Yn Y Cwtsh. This week I had better do some homework.

Intensive/ immersive is the way to go I think if you want to develop the skills quickly and efficiently, and if you have the time, admittedly for a little longer than the two days I managed and those were tiring enough! The tutors were very good and it was also good to meet more fellow Welsh learners. The primary feeling at the end of the two days however was exasperation that I hadn't had the opportunity to learn Welsh in school when it would have been much, much easier! At least this angle seems to be covered in Wales now. Saying this, I would probably have been one of those individuals who 'lost' the language after school and had to come back to study again anyway when their interest was renewed, as was the case with some of my course peers.

Anway, below are a few resources I find/ have found useful. I wanted to tie these together for my own reference and as we seem to be lacking any decent portal for Welsh learners centralising such information:

  • Cysill Ar-lein (http://www.cysgliad.com/cysill/arlein/)  - I've only had a quick play but looks great for the experienced learner or fluent speaker as it allows you to check whole chunks of text. This from the Language Technologies Unit of Canolfan Bedwyr - Bangor University's centre for Welsh language services, which seems to be the dominant such establishment in Wales. There is also an offline application for purchase (Cysgliad), though there is no indication of price on the website/ no ability to order online. A quick google indicated a price of £40 for Windows and that the Mac version was available for free(!).
  • Porth Termau/ Terminology Portal (http://ap.termau.org/) which provides a simple web interface to the underlying databases
  • The BBC Welsh dictionary (http://www.bbc.co.uk/wales/welshdictionary/en-cy/) which again is a web view on the Bangor database(s). Also see their other resources, e.g. Pigion and Catchphrase. I may well return to more general resources like these, Hwb, SSIW, Yr Bont, etc. in another post. 03/02/13 (thanks to Esyllt): also available via the dictionary pages the BBC's online verb conjugator.
  • Google translate (http://translate.google.co.uk) - which I use often and seems pretty good, albeit I'm only Canolradd so my judgement might be questionable. It has some nice tools not available on other solutions. It's also embedded into Google's Chrome browser - it will recognise the language and offer to translate whole pages should you use this browser. There are a number of other web browser plugin tools which I may return to at a later date, though I don't tend to use them.
  • Geriadur.net (http://www.geiriadur.net) - the dictionary from Trinity Saint David. I haven't used it much but know of others who prefer it.
  • 'Ap Geiriaduron' (http://www.bangor.ac.uk/canolfanbedwyr/ap_geiriaduron.php.en) All the above are online resources ... no good in the Kymin in Penarth during my lessons as connectivity is non-existent. For offline apps best served are Android and Apple users with 'Ap Geiriaduron'. Hopefully this will make it to other platforms as well in the not too distant future.
  • Geiriadur yr Academi 03/02/13 (thanks to Esyllt): another from Canolfan Bedwyr, though not listed in the Language Technologies Unit Websites page.

In summary, Canolfan Bedwyr seems to be dominating the market! Personally I used to use the BBC view on the database(s) but after buying a Nexus 7 this has largely been replaced with 'Ap Geiriaduron' and I also frequently use Google Translate. I should also, I think, give Cysill Ar-lein more of a go in terms of a learning aid.

Digon ar hyn o bryd. I may return with a list of some more general resources at a later date. Feel free to suggest some and I will collate. Similarly if I have missed any dictionary resources you like, please leave a comment via the below. Or if you know of any good web portals for Welsh learners as I've struggled to find one. So maybe I'll make one.

Chris.

Additional:

21/01/13: Glosbe - the multilingual online dictionary (http://glosbe.com/); also I've been using Google translate and it's not quite as good as I thought - gets a bit confused with the more complex ...

17/04/13: Gweiadur - I haven't delved into much and there a few 'inconsistencies' in this beta site, but looks to be an interesting project.

17/06/13: Just spotted Eurfa which also includes a downloadable dictionary.

 

Comment: A Microsoft/ Skype Fiasco (no decent Skype client for Windows Phone 7.8)

30. December 2012 16:17 by Chris in Windows Phone  //  Tags: ,   //   Comments (2)

Sometimes, more frequently than shoulf eb the case, it seems like Microsoft's left hand doesn't know what their right hand is doing, or an alternative idiom might be " it's arse from it's elbow".

So Microsoft buys Skype. It starts improving the integration with it's other products and service. I've never been a big Skype user but in part prompted by Microsoft's purchase I think it might be a good idea to get a Skype number for business calls. Rather than use my old, personal account I spot the fact that I can now sign in with (one of) my Microsoft logins - "Windows Live" I think is the current vernacular if marketing haven't changed it since last I looked. Great. This seems sensible - this should give me the option to pull in contcats from elsewhere ... should save some time and effort. The UI offers me the option to merge with an existing account - I don't go for this option as the UI doesn;t explain what exactly this means, and anyway surely I would be abel to perform a merge after regustration should I so choose?!

All good - I register, I buy my skype number. Side bar: it was impossible to find out how much this was goign to cost without going through the purchase process. Seems an OK price and there is a monthly option so I can see how it goes. All setup in a few minutes and I test the number and the allied voicemail. All good so I change the contact details on my email sig and website. Quite happy.

Oh dear though ... I go to my Nokia Lumia 800 to change the skype account to match the new one so that I may receive skype calls to my mobile and it doesn't accept the microsoft login or the accompanying autogenerated skype login. I must be doing something wrong - surely Microsoft wouldn't have cocked up like this? A quick google and yes, it seems this was a deliberate decision. I check for updates to the skype client - there haven't been any for months. I can't find any roadmap for OS/ device releases. I try the skype support twitter account - there is no response in 24hrs. So I fire off a support email. I keep it brief;):

There is no option on windows phone 7 skype client to login with a Microsoft account. Solution?

I am pleasantly surprised to receive a response within an hour or so. The response itself I am less happy with:

We understand that you wish to sign in to Skype using a Microsoft account on Windows 7. We'd be glad to look into this for you.

Unfortunately this feature is not currently available in Skype. 
 
We will pass on your request to our development team for consideration and potential inclusion in a future release.

Should any further issues arise, please feel free to contact us again.

Hmmm, Windows 7? Solution? Next email:

An entirely unsatisfactory response! It’s Windows Phone 7  and the question is outstanding – what is the recommended workaround solution to enable skype usage on win phone 7? E.g. do I need to create a new skype account and merge with the Microsoft account – will this work?

Skype support response (from different support person, continued quick response):

We understand that you want to sign in to Skype on your Windows 7 phone using your Microsoft account. We know how important this is for you, and we would like to inform you further about this issue.

To view your Microsoft contacts, you need to sign in using your Microsoft email address and password on Skype. Unfortunately, the option to do this is not yet available on your current device. Even if you merged your Microsoft account to a Skype account, you will still be required to sign in using Microsoft credentials to be able to view your Microsoft contacts.

Please accept our sincerest apologies on this matter, and we thank you for your patience and understanding.

 Which isn't actually my main issue. Next email:

Thanks for the rapid responses.

I don’t particularly mind if I don’t see my Microsoft related contacts – I can presumably set them up separately if I need to. What I would like to be able to do is receive Skype calls to my mobile via the Skype number I bought yesterday but which is currently associated with the Microsoft based account I also setup yesterday. I have a separate skype account ‘olops2000’ I have previously used but I created a new Microsoft linked account yesterday and did not merge accounts at the time as I was not made aware of the Window Phone limitation. Can the olops2000 account now me merged with the Microsoft account, for example, so that I may receive calls as above? If you can supply me a brief step by step as to how I currently can achieve this goal ether way that would be great.

Thanks in advance.

 Skype support's response:

We understand that you want to sign in using your Microsoft account so you can make use of the Skype Number that you have bought under your account. Please allow us to assist you with this issue.

Please be informed that your Microsoft account is currently merged to an automatically generated Skype ID live:chris.sully, which was created when you signed up for Skype using Microsoft credentials. You cannot use this Skype Name to log in nor can you reset the password for it. To access this account, you will need to use your Microsoft email address and password.

When we unmerge your accounts so that you may merge to an existing Skype account, please note that all purchases on the live:chris.sully account will be lost. Also, it is not possible to transfer purchases from one account to another. What we can recommend is that you use another device that supports logging in using a Microsoft account so that you may use the Skype Number that you have bought.

We hope you find this answer helpful. Should you need any further assistance or have additional questions, please do not hesitate to contact us again.

My turn:

Thanks for the clarity even though the situation is far from ideal.

Re: What we can recommend is that you use another device that supports logging in using a Microsoft account so that you may use the Skype Number that you have bought.

Is there a list of operating systems/ devices that support such functionality anywhere, i.e. a web page? Although given Microsoft now owns Skype I would hope/ expect the functionality would come to my device imminently? Would you be able to confirm if the functionality is planned for the next release of the Skype client for Windows phone 7.X (7.8 perhaps?) and/ or whether the functionality exists/ is planned for the Windows Phone 8 client? Also when these releases are scheduled for/ is there any ‘roadmap’  information available anywhere?

Thanks in advance.

Skype's turn (different person again):

Thank you for your reply.

We understand your concern that you want to know if you'll be able to log your Microsoft email in Windows phone 7.8 or 8. We'll be more than glad to further assist you.

Yes, you may use your Microsoft credentials on Windows phone 8. We're sorry that it's not available for version 7.8

Should any further issues arise, please feel free to contact us again.

So we get there in the end: another dead end for Windows Phone 7.X, even though Microsoft is now in charge! Rubbish!

 

Apps, Operating Systems and Devices

6. December 2012 16:00 by Chris in dev  //  Tags: , , ,   //   Comments (5)

The ratios of devices accessing content over the Internet has changed significantly over recent years. The chiefly impacting form factor here is the smartphone, with Apple driving matters with the iPhone and Android taking over in market share terms, and there are other players who will continue to attempt to challenge the current market dominance of the ‘big 2’ with perhaps the best bet being Microsoft though, admittedly, they have failed to make any significant impact with Windows Phone 7.X. This may change if Microsoft manage some decent and cross pollinating marketing of Windows 8, Windows RT and Windows Phone 8. I’m not holding my breath though. Hands up who knows the difference between WinRT and Windows RT, for example?

Anyway, each phone operating system and surrounding ecosystem has its strengths and weaknesses and I won’t enter into related discussions here. What I shall consider is the messy situation we have with ‘app’ development. The term ‘app’ has entered common parlance though I’m unsure what the shared understanding of the term actually is. Certainly this has been driven into the collective consciousness by Apple’s ‘AppStore’, and subsequently by the misleadingly named ‘Google Play’. Therefore ‘app’ refers to applications that are designed to be run on mobile devices – initially smartphones and then more recently on tablet devices such the iPad and the Nexus 7? Microsoft has jumped on the bandwagon with its similar Windows Phone Marketplace and, most recently, the Windows Store.

But ‘app’ just means application doesn’t it? So here each operating system has its own app store which delivers applications designed to be run on that operating system including, and this is key, a User eXperience consistence with the design values pertinent to the target device. Thus developers/ organisations, as part of their business model, may choose to target an individual platform for their application. If they have the right app each of the major platforms offers a significant market and this approach can work. The problem then comes if they wish to extend their market to other platforms – currently for the very best user experience they will need to develop that application in a quite different set of technologies which means that porting apps from one platform to another is expensive (I note that there are cross platform tools out there but, as far as I am aware, they remain largely unproven – see below).

Now switch to an alternate scenario of an organisation which is not targeting a platform but targeting an existing customer base and hence will find themselves in the situation of prioritising development of their apps for differing platforms. Take the example of a bank who wishes to produce an app for customer to perform account management. Why? Well for competitive advantage of course – to keep existing customers with them and to encourage new customers to them. They will need to produce and maintain multiple versions of the application for the different platforms: 2,3,4? Logically they would then continue to prioritise platforms based on the breakdown of their current/ targeted user base.

So, firstly is this situation any different from that with more traditional devices – Apple or Microsoft OS based desktop or laptop devices. Yes because the mobile nature of devices opens up so many more useful app scenarios and the app store concept has taken off. No in that as we had, and have, OS specific traditional client computing ‘apps’ – the solution was moving the applications to the web and to related cross platform technologies.

So there are two, related solutions to this problem area which is only going to get worse as the market further fractures with device form factors and operating systems:

  1. rather than having client applications specifically developed for each mobile OS let’s write them in HTML5 and related technologies. There has already been a bug push in the last 3years+ to push more and more functionality down to the client as devices became more powerful and this path offered more scalability than each client devices using significant computing resources. A caveat here – mobile devices offer significantly less computing resources than your desktop client, though this is changing quickly. Technology has a habit of doing this …
  2. rather than having client applications specifically developed for each mobile OS let’s write them in a generic fashion and rely more on using tools and technology to ‘translate’ these apps to work in a variety of client devices.

Or, probably, a bit of both. The downside? Well, there is device specific knowledge and trickery to ensuring optimal user experiences (particularly) in apps. Will the user experience be good enough for end users via cross-platform development solutions? I hope so. The current situation can’t be sustainable, can it?

Chris Sully
Technical Director, Propona

[first published: https://connect.innovateuk.org/web/propona/blog/-/blogs/apps-operating-systems-and-devices?ns_33_redirect=%2Fweb%2Fpropona%2Fblog]

 

Allowing for User Mischief - A Bad Word Filter

12. April 2003 15:39 by Chris in   //  Tags:   //   Comments (1)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn, where the code is also available for download

Introduction

This article presents an improved core implementation of a solution to a particular problem I come across occasionally: detection and/ or removal of suspect words from user supplied text on web sites. A typical application scenario might be a discussion forum. For example, I’ve worked on a few sports related web sites where discussions can become …heated and the language used occasionally strays into that inappropriate for the general audience of the site. There are several approaches to dealing with this problem, some of which are discussed in my previous articles on this subject published on ASPAlliance ( http://www.aspalliance.com/sullyc/articles/user_mischief.aspx - no longer available) and 15Seconds ( http://www.15seconds.com/issue/030121.htm - no longer available). The latter article looks at a composite control based implementation by the way. As indicated in these articles I suggest the best way would be identification and removal of any suspect words.

A Starting Point

My previous implementation was based on a word and/ or word fragment ('word roots') list defined with an XML document. The items from this list were then compared against the user-inputted text string and matches highlighted using the string manipulation functionality available in .NET. For re-usability I decided on a user control. See the first article ( http://www.aspalliance.com/sullyc/articles/user_mischief.aspx ) for full details but the core of the implementation is reading the XML into a local data structure for subsequent direct comparison, in this case an ArrayList:

 <%@Control Language="VB"%>
 <%@ Import Namespace="system.Xml" %>
 
 <script language="VB" runat="server">
 
 Dim alWordList As new ArrayList
 
 Sub Page_Load()
   dim xmlDocPath as string = server.mappath("bad_words.xml")
   dim xmlReader as XmlTextreader = new xmlTextReader(xmlDocPath)
   while (xmlReader.Read())
     if xmlReader.NodeType=xmlNodeType.Text then
       alWordList.Add(xmlReader.Value)
       trace.write("Added: " & xmlReader.Value)
     end if
   end while
   xmlReader.Close()
 End Sub
 
 Public Function CheckString(InputString as String) as string
   dim element as string
   dim output as string
   trace.write("Checking " & InputString)
   For Each element in alWordList
     trace.write("Checking: " & element)
     InputString=InputString.Replace(element,"****")
   Next
   trace.write("Returning " & InputString)
   Return InputString
 End Function
 
 </script> 

with the XML file being of the format:

 <?xml version="1.0"?>
 <words>
   <word>word root 1</word>
   <word>word root 2</word>
 </words> 

With the actual words replaced to protect the innocent.

Then all that remains is capturing of user text, via a textbox perhaps, registering the user control for use in the page:

 <anti_swear:cleanup id="cleanup1" runat="server" /> 

and using the control to check the inputted text for ‘naughty’ word roots:

 dim clean_text as string
 
   clean_text=tbMessage.text ‘text to be checked
   trace.write("message text: " & clean_text)
 
   clean_text=cleanup1.CheckString(clean_text)
   trace.write("message text (cleaned): " & clean_text)
 
   if clean_text<>tbMessage.text then
     trace.write("Text not clean!")
     tbMessage.text=clean_text
     lblInfo.Text="Naughty words found ... please remove!"
   else
     'all is OK … submit to db/ other permanent store for later recall 

So, in this simple implementation CheckString returns a string with naughty word roots replaced by ‘****’ and we can detect if such words have been found as the returned text will be different from that passed into the function.

The actual detection is very simple:

 InputString=InputString.Replace(element,"****") 

too simple in fact as we’ll shortly explore.

Note that in the XML document I’ve used the phrase ‘word root 1’: important as it is only the roots of suspect vocabulary that you need to place in the XML document, thus reducing the effort involved for you. This should limit the number of XML elements you need to introduce to cover the commonly used expletives but also means care must be taken not exclude perfectly acceptable words.

The Problem

What’s the problem? Well, you may well have already realised that as pointed out to me by my fellow ASPAlliance columnist Jonathan Cogley (and as alluded to in the last paragraph of the last section):

Your approach uses a regular text replace which could create a new problem. Since it will identify the offending sequence of letters in perfectly harmless words e.g. scunthorpe would be rendered as s****horpe.

Jonathan suggested two possible solutions, with my thoughts on implementation also below:

  • a white word list – a list of permissible words to be checked if a word on the black word list was found. This approach I believe to be too prone to programmer 'error' – there are too many language combinations to provide a sleek solution.
  • regular expressions – the powerful language of regular expressions should be able to provide a better matching algorithm that would alleviate the problem.

The Solution

Let’s consider and see if we can find a better solution. An obvious starting point is the example exception above and thus the regular expression concept of word boundaries.

Scunthorpe (a town in the UK for our international readers and possible towns elsewhere in the world for all I know).

As we’re interested in roots of words we’d prefer that Scunthorpe not match because it starts with an S and hence shouldn’t be offensive to anyone. However we are interested in matching any derivatives of our dubious root words so whilst we want to specify the beginning word boundary we shouldn’t be interested in the ending word boundary.

In regular expressions word boundaries are identified via the concept of an anchor. Anchors specify the position where the pattern occurs. For example:

^ Matches at the start of a line.
$ Matches at the end of a line.
\< Matches at the beginning of a word.
\> Matches at the end of a word.
\b Matches at the beginning or the end of a word.
\B Matches any character not at the beginning or end of a word.

Thus the above include a few options we’re interested in. Let’s use \b, the word boundary anchor. This represents anything that can come before or after a word, e.g. white space, punctuation and/or the beginning or end of a line.

So we want to engage in a regular expression search / replace for '\broot word'. This should solve our problem. How do we do this in .NET?

Regular Expression Solution in .NET

We’re going to focus on solving this little problem and shall not be considering the range of extensive support for Regular Expressions in .NET. However, look out for such an article on dotnetjohn in the near future.

There are a variety of supporting classes we could use:

Regex: the Regex class represents a regular expression. It also contains static methods that allow use of other regular expression classes without explicitly instantiating objects of the other classes.

Match: the Match class represents the results of a regular expression matching operation.

MatchCollection: the MatchCollection class represents a sequence of successful non-overlapping matches.

An example of how we might utilize the Regex class is:

 Dim r As Regex = New Regex("\b" & “NaughtyRoot”)  
 

Further, among the members of the Regex class are:

IsMatch - indicates whether the regular expression finds a match in the input string.

Match - searches an input string for an occurrence of a regular expression and returns the precise result as a single Match object.

Matches - searches an input string for all occurrences of a regular expression and returns all the successful matches as if Match were called numerous times.

Replace - replaces all occurrences of a character pattern defined by a regular expression with a specified replacement character string.

In line with our previous implementation we would use the Replace function, replacing our CheckString function with:

 Public Function CheckString(InputString as String) as string
   Dim r As Regex
   dim element as string
   dim output as string
   trace.write("Checking " & InputString)
   For Each element in alWordList
     r = New Regex("\b" & element)
     trace.write("Checking: " & element)
     InputString=r.Replace(InputString,"****")
   Next
   trace.write("Returning " & InputString)
   Return InputString
 End Function 

Which does indeed do what we wish. One caveat is that as we are only checking for the beginning of words some swear words may slip through the net if we don’t explicitly add them to the bad words list. ‘Motherf**ker’ is an example. I can’t think of an easy way around this problem however. You could extend the solution to include end of word boundaries but then you need to include ‘f**ker’ as well as ‘f**k’, for example. Plus, you increase the risk of trapping valid words.

Note also that the provided solution is not perfect on the grounds that some valid words will no doubt still be challenged by this solution. I do believe it is a good compromise, however. It might be a good option to change the language of the interface to indicate the presence of 'possibly suspect words' and to let the user edit the text. It should be obvious to the user why their text has been returned to them.

Conclusion

I hope this article has provided a useful extension to my earlier articles on the subject and in doing so introduced some readers to the powerful language provided by regular expressions. If you’d like to raise any points about this article, in particular thoughts on how the solution could be improved, email me (sullyc-olops@btinternet.com ).

The Zipfile

The zipfile includes the following:

markII.aspx web form page with text box and calling user control methods.

user_controls
/anti_swear.ascx string based version
/anti_swear2.ascx regular expression based version
/bad_words.xml

To use, populate bad_words.xml and alter the user control reference in markII.aspx to see the differences between the versions.

You may download the code here.

XSD Schemas: An Introduction

31. March 2003 15:35 by Chris in dev  //  Tags:   //   Comments (0)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn.

Introduction

The XML Schema definition language (XSD) enables you to define the structure (elements and attributes) and data types for XML documents. It enables this in a way that conforms to the relevant W3C recommendations for XML schema. XSD is just one of several XML schema definition languages but is the one best supported by Microsoft in .NET.

The schema specifies the ordering of tags in the document, indicates fields that are mandatory or that may occur different numbers of times, gives the datatypes of fields and so on. The schema importantly is able to ensure that data values in the XML file are valid as far as the parent application is concerned.

Schemas are also useful when developers in different companies or even in different parts of the same company read and write XML documents that they will share. The schema acts as a contract specifying exactly what one application or part of an application must write into an XML file and another program can expect to be there. The schema unambiguously states the correct format for the shared XML.

A well formed XML document is one that satisfies the usual rules of XML. For example, in a well formed document there is exactly one data root node, all opening tags have corresponding closing tags, tag names do not contain spaces, the names in opening and closing tags are spelt in exactly the same way, tags are properly nested, etc.

A valid document is one that is well formed and that satisfies a schema.

Visual Basic .NET provide several methods for validating an XML document against a schema. There are articles on how to do this already on dotnetjohn ( Upload an XML File and Validate Against a Schema ). The focus of this article however shall be on the basic elements of the Microsoft preferred schema language (XSD), after a brief history lesson / an introduction to other common types of schema you may come across and why the XSD alternative was developed.

DTD and XDR

While XML is a relatively new technology the need for schemas was recognised early and so several have already been created. Microsoft focuses heavily on the most recent version, XSD, so VB has the most support for this form of schema, and hence will probably be the one Microsoft developers use most.

However, you may well happen upon the situation, particularly with enterprise development, where you are required to work with other forms of XML schema. While VB has few tools for building other types of schema, it can validate data using DTD and XDR.

The first schema standard was developed alongside XML v1.0 and is DTD (Document Type Definition) schemas. This, many believed, was not an ideal solution as a schema definition language which is why Microsoft came up with XSD as its own suggested replacement and submitted this to the W3C for consideration. One of the problems was, and is, that DTDs are not XML based so you have yet another language to learn to go with the proliferation that comes with XML (XPath and XSL for example). Further, developers also found that DTD lacked the power and flexibility they needed to completely define all of the datatypes they wanted to represent in XML. A schema that can’t validate all of the data’s requirements is of limited use.

XDR (XML Data Reduced) is another schema language, this time XML based and providing a superset of the functionality of DTDs. XDR should not be confused with Sun’s XDR (External Data Representation) … another format for data description but in this case physical representation of data rather than logical representation as per XML and XML Data Reduced schemas.

The last few paragraphs were just to let you know there are other schema formats out there, some of which have limited support in .NET. Now we’ll focus on XSD.

XSD

Wherever you see 'schema' from now we’re referring to XSD. As per many topics relating to XML (see my article on XSL Understanding How to Use XSL Transforms) the XSD specification is complex as well as being quickly evolving. The following will cover the basics of XSD so you can start to construct some useful schemas for use in your own applications. You’ll then need to follow up the information presented elsewhere.

Note that Visual Studio .NET includes an XSD editor that makes generating schemas relatively painless. Unless you understand some of the basic rules of XSD, however, the editor may prove a tad confusing.

Types and Elements

XSD schemas contain type definitions and elements. A type definition defines an allowed XML data type. An 'address' might be an example of a type you might want to define. An element represents an item created in the XML file. If the XML file contains an Address tag, then the XSD file will contain a corresponding element named Address. The data type of the Address element indicates the type of data allowed in the XML file’s Address tag.

Type definitions may be simple or complex. Simple and complex types allow definition of the new data types in addition to the 19 built in primitive data types which include string, Boolean, decimal, date, etc.

A simpleType allows a type definition for a value that can be used as the content of an element or attribute. This data type cannot contain elements or have attributes.

A complexType allows a type definition for elements that can contain attributes and elements.

Let’s pause here and take a look at an example. Let’s work backwards from an XML document as I’ll assume we’re all reasonably familiar with XML but less so with XSD. Here’s an XML file representing a simplified contacts database containing just one record currently:

 <?xml version="1.0" encoding="utf-8" ?>
 <Contacts>
   <Contact>
     <FirstName>Chris</FirstName>
     <Surname>Sully</Surname>
     <Address>
       <Street>22 Denton Road</Street>
       <City>Cardiff</City>
       <Country>Wales</Country>
     </Address>
     <Tel>02920371877</Tel>
   </Contact>
 </Contacts> 

In fact in Visual Studio .NET you can simply right click on this XML file and generate the schema from it. Of course it may well not be quite correct for your needs, as it shall be based on one record of data. Not even Visual Studio .NET can predict the future with any accuracy … ;) Here’s what it comes up with:

 <?xml version="1.0" ?>
 <xs:schema id="Contacts" targetNamespace="http://tempuri.org/XMLFile1.xsd" xmlns:mstns="http://tempuri.org/XMLFile1.xsd" xmlns="http://tempuri.org/XMLFile1.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" attributeFormDefault="qualified" elementFormDefault="qualified">
   <xs:element name="Contacts" msdata:IsDataSet="true" msdata:Locale="en-GB" msdata:EnforceConstraints="False">
     <xs:complexType>
       <xs:choice maxOccurs="unbounded">
         <xs:element name="Contact">
           <xs:complexType>
             <xs:sequence>
               <xs:element name="FirstName" type="xs:string" minOccurs="0" />
               <xs:element name="Surname" type="xs:string" minOccurs="0" />
               <xs:element name="Tel" type="xs:string" minOccurs="0" />
               <xs:element name="Address" minOccurs="0" maxOccurs="unbounded">
                 <xs:complexType>
                   <xs:sequence>
                     <xs:element name="Street" type="xs:string" minOccurs="0" />
                     <xs:element name="City" type="xs:string" minOccurs="0" />
                     <xs:element name="Country" type="xs:string" minOccurs="0" />
                   </xs:sequence>
                 </xs:complexType>
               </xs:element>
             </xs:sequence>
           </xs:complexType>
         </xs:element>
       </xs:choice>
     </xs:complexType>
   </xs:element>
 </xs:schema> 

Picking out some key elements:

An XML Schema is composed of the top-level schema element. The schema element definition must include the following namespace:

http://www.w3.org/2001/XMLSchema

and you can see from the above this isn’t all that is generated, but we’ll ignore the extra elements for now.

The actual definition commences with the first <xs:element… definition which has the name attribute 'Contacts'. Again the other attributes we can ignore for now. Contacts is necessarily defined as a complex type as it contains other elements.

We then encounter <xs:choice…: a choice element allows the XML file to contain one of the elements inside the choice element. The attribute maxOccurs="unbounded" is used to indicate that the Contacts element can contain any number of Contact elements.

The contact element is again a complex type comprised of a sequence of further elements. A sequence element requires the XML document to contain the items inside the sequence in order. By default sequence elements must appear exactly once; this can be overridden using the minOccurs and maxOccurs attributes to indicate it can occur any number of times (including 0).

The individual elements are defined to be of type string (a simple type). Address is similar defined as a complex type of a sequence of elements of type string.

Hopefully that has been an informative introduction to some commonly encountered constructs by way of an example. We’ll now continue on to look at some of the XSD language constructs in a little more detail, starting with elements.

Elements and their attributes

e.g. <xs:element name="Street" type="xs:string" minOccurs="0" />

An element defines an entity in an XML file. So the above defines an element of name <Street> and type string. The element can have several attributes which modify the elements behaviour, for example:

minOccurs and maxOccurs: as indicated already these give the minimum and maximum allowed number of times an element can occur within a complex type. To make an element optional minOccurs is set to 0. To allow an unlimited number of the element maxOccurs is set to 'unbounded'.

ref: makes the element a copy of another element defined in the schema. This is best avoided however... it is better to define a distinct type and base both element definitions on this type rather than introduce such dependencies into the schema, e.g.

 <xsd:simpleType name=”PhoneNumberType>
   <xsd:restriction base=”xsd:string/>
 </xsd:simpleType>
 ...
 <xsd:complexType name=”Contact>
   <xsd:sequence>
   ...
     <xsd:element name=”HomeTeltype=”PhoneNumberType/>
     <xsd:element name=”WorkTeltype=”PhoneNumberType/>
   ...
   <xsd:sequence>
 </xsd:complexType> 

Though you might at the same time like to tie down your definition of the PhoneNumberType more tightly. We’ll return to <xsd:restriction … shortly.

default: assigns a default value to the element in which case if the XML document omits the corresponding field it will be assumed to have this value. An element that has a default value should also have minOccurs set to 0 so the XML document may omit it.

fixed: gives the element an unchangeable value. The corresponding XML element cannot have another value, although it may be omitted if minOccurs is 0. Why is this useful? Well, you may want to ensure that an XML data field has the same value throughout the document; for example, you may want to add a new Country field to an existing XML document and ensure that its value is UK for every record.

Types

Type definitions have two goals:

  1. To describe the data allowed in a simple field, e.g. text format of an e-mail address. Simple types achieve this goal.
  2. To describe relationships amongst different fields, e.g. a contact type consists of a sequence of firstname, surname, telephone, etc. Complex types achieve this goal of designing more complex data types.

In addition to simple and complex types there are built in types, similar to simple data types such as integers, dates, etc. in other programming languages or .NET’s value data types provided by the Common Type System (CTS). We’ve seen one of the built in types already in our element definitions in the form of the often-employed string type. These built in types are W3C defined and include date, dateTime, decimal, double, float, Year, etc. etc. See the SDK documentation for an authoritative list.

A facet is a characteristic of a data type that you can use to restrict the values allowed by a type. Facets are effectively attributes of the data type. For example, the string datatype has a maxLength facet. Again for further details of the facets of each built in type see the SDK documentation.

Facets enable short cuts to building simple types by restricting another data type. We’ve already seen the restriction construct in example code above; using this and the enumeration facet of the string built in data type we can define allowable values for a type, e.g.

 <xsd:simpleType name=”Colours>
   <xsd:restriction base=”xsd:string>
     <xsd:enumeration value=”red/>
     <xsd:enumeration value=”green/>
     <xsd:enumeration value=”blue/>
   </xsd:restriction>
 </xsd:simpleType> 

The pattern facet is particularly powerful as it specifies a regular expression that the XML field data must match. Regular Expressions are worthy of an article or three in themselves, and there are several books on the subject if interested in improving your knowledge. Look out for an article on regular expressions on dotnetjohn in the not too distant future! For now, we’ll largely skip over the topic of regular expressions though here is an example:

 <xsd:simpleType name=”emailType>
   <xsd:restriction base=”xsd:string>
     <xsd:pattern value=”[^@]+@[^@]+\.[^@]+” />
   <xsd:restriction>
 <xsd:simpleType /> 

Let’s decipher ”[^@]+@[^@]+\.[^@]+” – this matches an e-mail address of the form a@b.c where a,b and c are any strings that do not contain the @ symbol. The value string equates to 'match any character other than the @ symbol one or more times; then match an @ symbol; then again match any character other than the @ symbol one or more times; next match a full stop and then once more any character other than the @ symbol one or more times'.

The use of the length, minLength, maxLength, totalDigits, fractionDigits, minExclusive, maxExclsuive, minInclusive, maxInclusive facets are all self-describing but it’s important to know they are available.

In addition to the primitive built in types there exist built in data types derived from these primitive types. These derived built in data types refine the definition of primitive types to create more restrictive types. They are based on the string and decimal primitive types.

The string derived types represent various entities that occur in XML syntax itself. For example, the Name type represents a string that satisfies the form of XML token names – it begins with a letter, underscore or colon and the rest of the string contains letters and digits.

The decimal derived types represent various kinds of numbers and thus are considerably more useful for validating data. There are thirteen such decimal derived types, e.g. byte, int, negativeInteger. See the SDK documentation for the full list.

Attributes

Just as you use an XSD schema’s element entities to define the data that can be contained in the corresponding XML data elements you can use attribute entities to define the attributes the XML element can have. Let’s return to Visual Studio .Net and see what schema it comes up with for the following small attribute-centric piece of XML.

 <contacts>
   <contact firstname=”ChrisSurname=”Sully/>
 </contacts> 

 

 <?xml version="1.0" ?>
 <xs:schema id="contacts" targetNamespace="http://tempuri.org/attribute_centric.xsd" xmlns:mstns="http://tempuri.org/attribute_centric.xsd" xmlns="http://tempuri.org/attribute_centric.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" attributeFormDefault="qualified" elementFormDefault="qualified">
   <xs:element name="contacts" msdata:IsDataSet="true" msdata:Locale="en-GB" msdata:EnforceConstraints="False">
     <xs:complexType>
       <xs:choice maxOccurs="unbounded">
         <xs:element name="contact">
           <xs:complexType>
             <xs:attribute name="firstname" form="unqualified" type="xs:string" />
             <xs:attribute name="Surname" form="unqualified" type="xs:string" />
           </xs:complexType>
         </xs:element>
       </xs:choice>
     </xs:complexType>
   </xs:element>
 </xs:schema> 

You can see that attributes equate to the

<xs:attribute name="firstname" form="unqualified" type="xs:string" />

construct. The form attribute of the attribute tag (yes that does make sense!) is set to unqualified. This means that the attributes in the XML file do not need to be qualified by the schema’s namespace.

Why use attributes rather than elements (referred to as attribute-centric and element-centric XML)? Well, they are often interchangeable and it is largely a matter of taste. Generally however, elements should contain data and attributes should contain information that describes the data. So for contacts one could recommend an attribute centric approach.

However, such decisions are mitigated by the following:

  • the attribute centric approach consumes less file space
  • attributes can specify default values whereas elements generally do not
  • you can order elements via the sequence construct; there is no method of enforcing order with attributes
  • elements can occur more than once in complex types but an attribute can occur only once

Complex Types

As previously stated, whereas a simple type determines the type of data a simple text field can hold, a complex type defines relationships among other types. For example we defined a contact record to include fields to store first name and surname. Simple types are then used to define the allowable values in the fields. The complex type determines the fields that make up the contact type.

Complex types are also useful for defining XML elements that can have attributes – simple types cannot have attributes. A complex type can contain only one of a small number of elements. The elements within that element define the relationship the complex type represents. The most common elements are simpleContent, sequence, choice and all, as follows:

simpleContent: a complex type that contains a simpleContent element must contain only character data or a simple type. This construct is primarily so one may add attributes to a simple type.

sequence: as we’ve seen this allows specification of a required order to elements of a complexType.

choice: again as we’ve seen the corresponding XML data must include exactly one of the elements listed inside the choice. Note this is entirely different from the enumeration facet previously introduced: rather than a fixed set of values the choice construct allows the complex type to contain one of several types.

all: when a complex type includes the all element the corresponding XML data can include some or all of the listed elements in any order.

Named and Unnamed Types

Finally, to finish off our overview: if you will use a type only once there is no need to give it a name as you will not need to reference it again. You may choose to include the type definition in the code which uses it. This is the case for the Visual Studio generated schemas above, e.g.:

 <xs:complexType>
   <xs:choice maxOccurs="unbounded">
     <xs:element name="contact">
       <xs:complexType>
         <xs:attribute name="firstname" form="unqualified" type="xs:string" />
         <xs:attribute name="Surname" form="unqualified" type="xs:string" />
       </xs:complexType>
     </xs:element>
   </xs:choice>
 </xs:complexType> 

If the schema referenced this complexType again it would be more succinct to add a name attribute to the <xs:complexType … definition so you could reference it again later in the schema definition.

To clarify by example, the following uses the email simple type we introduced earlier to reduce the size of an ‘EmailContactType’ definition:

 <xsd:simpleType name=”emailType>
   <xsd:restriction base=”xsd:string>
     <xsd:pattern value=”[^@]+@[^@]+\.[^@]+” />
   <xsd:restriction>
 <xsd:simpleType />
 
 <xsd:complexType name=”emailContactType>
   <xsd:sequence>
     <xsd:element name=”nametype=”xsd:string/>
     <xsd:element name=”emailtype=”emailType/>
   </xsd:sequence>
 </xsd:complexType> 

Alternatively you could have defined the email type within the email element. Whether reusing or not, employing this convention will generally make your code tidier.

Conclusion

There we shall halt our introduction to XML Schemas and to the basic XSD constructs specifically and hope you are better placed to understand why and how to use XML schemas in .NET

References

.NET Framework SDK documentation

Visual Basic .Net and XML
Stephens and Hochgurtel

Programming Visual Basic .NET
Francesco Balena
Microsoft Press

Error Handling in ASP.NET

30. March 2003 13:40 by Chris in dev  //  Tags: ,   //   Comments (0)

Introduction

Note that this article was first published on 30/03/2003. The original article is available on DotNetJohn.

I came to the realisation a little time ago that I really wasn’t making the most of the error handling facilities of ASP.NET. The sum of my ASP.Net error handling knowledge up until that point was the new (to VB) Try … Catch … Finally construct at the page level. While I shall examine this new addition there are more facilities at our disposal and this article shall share the results of my recent investigations.

In anything but the simplest of cases your application WILL contain errors. You should identify where errors might be likely to occur and code to anticipate and handle them gracefully.

The .NET Framework’s Common Language Runtime (CLR) implements exception handling as one of its fundamental features. As you might expect due to the language independence of the framework you can also write error handling in VB.NET that can handle errors in C# code, for example. The exception object thrown is of the same primitive type (System.Exception).

Options … Options (preventing runtime errors)

If we can we should capture errors as early as possible as fewer will then make it through to the runtime environment. VB.Net offers the Option Strict and Option Explicit statements to prevent errors at design time.

Classic ASP programmers should be familiar with Option Explicit – it forces explicit declaration of all variables at the module level. In addition when programming ASP.NET pages in VB.NET when Option Explicit is enabled, you must declare all variables using Public, Private, Dim or Redim.

The obvious mistake that Option Explicit captures is using the same variable name multiple times within the same scope … a situation which is very likely to lead to runtime errors, if not exceptions.

Thankfully, in ASP.NET Option Explicit is set to on by default. If for any reason you did need to reset, the syntax is:

 Option Explicit Off

at the top of a code module or

 <%@ Page Explicit=”False”%> 

in a web form.

Enabling Option Strict causes errors to be raised if you attempt a data type conversion that leads to a loss in data. Thus, if this lost data is of potential importance to your application Option Strict should be enabled. Option Strict is said to only allow ‘widening’ conversions, where the target data type is able to accommodate a greater amount of data than that being converted from.

The syntax is as per Option Explicit.

Exceptions

What precisely is an exception? The exception class is a member of the System namespace and is the base class for all exceptions. Its two sub-classes are the SystemException class and the ApplicationException class.

The SystemException class defines the base class for all .NET predefined exceptions. One I commonly encounter is the SQLException class, typically when I haven’t quite specified my stored procedure parameters correctly in line with the stored procedure itself.

When an exception object is thrown that is derived from the System.Exception class you can obtain information from it regarding the exception that occurred. For example, the following properties are exposed:

HelpLink Gets or sets a link to the help file associated with this exception.
InnerException Gets the Exception instance that caused the current exception.
Message Gets a message that describes the current exception.
Source Gets or sets the name of the application or the object that causes the error.
StackTrace Gets a string representation of the frames on the call stack at the time the current exception was thrown.
TargetSite Gets the method that throws the current exception.

See the SDK documentation for more information.

The ApplicationException class allows you to define your own exceptions. It contains all the same properties and methods as the SystemException class. We shall return to creating such custom exceptions after examining the main error handling construct at our disposal: 'Try … Catch … Finally'.

Structured and Unstructured Error Handling

Previous versions of VB had only unstructured error handling. This is the method of using a single error handler within a method that catches all exceptions. It’s messy and limited. Briefly, VB’s unstructured error handlers (still supported) are:

On Error GoTo line[or]label
On Error Resume Next
On Error GoTo 0
On Error GoTo -1

But forget about these as we now have new and improved C++ like structured exception handling with the Try ... Catch … Finally construct, as follows:

Try
[ tryStatements ]
[ Catch [ exception [ As type ] ] [ When expression ]
[ catchStatements ] ]
[ Exit Try ]
...
[ Finally
[ finallyStatements ] ]
End Try

Thus we try to execute some code; if this code raises an exception the runtime will check to see if the exception is handled by any of the Catch blocks in order. Finally we may execute some cleanup code, as appropriate. Exit Try optionally allows us to break out of the construct and continue executing code after End Try. When optionally allows specification of an additional condition which must evaluate to true for the Catch block to be executed.

Here’s my code snippet for SQL server operations by way of a simple, and not particularly good (see later comments), example:

 Try
 
   myConnection.open()
   myCommand = new SQLCommand("USP_GENERIC_select_event_dates", myConnection)
   myCommand.CommandType = CommandType.StoredProcedure
 
   myCommand.Parameters.Add(New SQLParameter("@EventId",SQLDBType.int))
   myCommand.Parameters("@EventId").value=EventId
 
   objDataReader=myCommand.ExecuteReader()
 
 Catch objError As Exception
 
   'display error details
   outError.InnerHtml = "<b>* Error while executing data command (ADMIN: Select Event Dates)</b>.<br />" _
   & objError.Message & "<br />" & objError.Source & _
   ". Please <a href='mailto:mascymru@cymru-web.net'>e-mail us</a> providing as much detail as possible including the error message, what page you were viewing and what you were trying to achieve.<p /><p />"
 
   Exit Function ' and stop execution
 
 End Try 

There are several problems with this code as far as best practice is concerned, the more general of which I’ll leave to the reader to pick up from the following text, but in particular there should be a Finally section which tidies up the database objects.

Note it is good form to have multiple Catch blocks to catch different types of possible exceptions. The order of the Catch blocks affects the possible outcome … they are checked in order.

You can also throw your own exceptions for the construct to deal with; or re-throw existing exceptions so they are dealt with elsewhere. See the next section for a little more detail.

You could just have a Catch block that trapped general exceptions – exception type ‘exception’ (see above!). This is not recommended as it suggests a laziness to consider likely errors. The initial catch blocks should be for possible specific errors with a general exception catch block as a last resort if not covered by earlier blocks.

For example, if accessing SQLServer you know that a SQLException is possible. If you know an object may return a null value and will cause an exception, you can handle it gracefully by writing a specific catch statement for a NullReferenceException.

For you information here’s a list of the predefined exception types provided by the .NET Runtime:

Exception typeBase typeDescriptionExample
Exception Object Base class for all exceptions. None (use a derived class of this exception).
SystemException Exception Base class for all runtime-generated errors. None (use a derived class of this exception).
IndexOutOfRangeException SystemException Thrown by the runtime only when an array is indexed improperly. Indexing an array outside its valid range: arr[arr.Length+1]
NullReferenceException SystemException Thrown by the runtime only when a null object is referenced. object o = null; o.ToString();
InvalidOperationException SystemException Thrown by methods when in an invalid state. Calling Enumerator.GetNext() after removing an Item from the underlying collection.
ArgumentException SystemException Base class for all argument exceptions. None (use a derived class of this exception).
ArgumentNullException ArgumentException Thrown by methods that do not allow an argument to be null. String s = null; "Calculate".IndexOf (s);
ArgumentOutOfRangeException ArgumentException Thrown by methods that verify that arguments are in a given range. String s = "string"; s.Chars[9];
ExternalException SystemException Base class for exceptions that occur or are targeted at environments outside the runtime. None (use a derived class of this exception).
ComException ExternalException Exception encapsulating COM HRESULT information. Used in COM interop.
SEHException ExternalException Exception encapsulating Win32 structured exception handling information. Used in unmanaged code interop.

Throwing Exceptions

As indicated earlier, not only can you react to raised exceptions, you can throw exceptions too when needed. For example, you may wish to re-throw an exception after catching it and not being able to recover from the exception. Your application-level error handling could then redirect to an appropriate error page.

You may further wish to throw your own custom exceptions in reaction to error conditions in your code.

The syntax is

Throw new [Exception]

Creating Custom Exceptions

As mentioned earlier, via the ApplicationException class you have the ability to create your own exception types.

Here’s an example VB class to do just that:

 Imports System
 Imports System.Text
 
 Namespace CustomExceptions
 
   Public Class customException1: Inherits ApplicationException
 
     Public Sub New()
 
       MyBase.New("<H4>Custom Exception</H4><BR>")
       Dim strBuild As New StringBuilder()
       strBuild.Append("<p COLOR='RED'>")
       strBuild.Append("For more information ")
       strBuild.Append("please visit: ")
       strBuild.Append("<a href='http://www.cymru-web.net/exceptions'>")
       strBuild.Append("Cymru-Web.net</a></p>")
       MyBase.HelpLink = strBuild.ToString()
 
     End Sub
 
   End Class
 
 End Namespace 

Looking at this code. On line 6 you see that to create a custom exception we must inherit from the ApplicationException class. In the initialization code for the class we construct a new ApplicationException object using a string (one of the overloaded constructors of ApplicationException – see the .NET documentation for details of the others). We also set the HelpLink property string for the exception – this is a user friendly message for presentation to any client application.

The MyBase keyword behaves like an object variable referring to the base class of the current instance of a class (ApplicationException). MyBase is commonly used to access base class members that are overridden or shadowed in a derived class. In particular, MyBase.New is used to explicitly call a base class constructor from a derived class constructor.

Next a small test client, written in VB.NET using Visual Studio.Net so we have both web form and code behind files:

WebForm1.aspx:

 <%@ Page Language="vb" AutoEventWireup="false" Codebehind="WebForm1.aspx.vb" Inherits="article_error_handling.WebForm1"%>
 <html>
 <body>
 </body>
 </html>

WebForm1.aspx.vb:

 Imports article_error_handling.CustomExceptions
 
 Public Class WebForm1
 
   Inherits System.Web.UI.Page
 
   Private Sub Page_Load(ByVal sender As System.Object, ByVal e As System.EventArgs) Handles MyBase.Load
     Try
       Throw New customException1()
     Catch ex As customException1
       Response.Write(ex.Message & " " & ex.HelpLink)
     End Try
   End Sub
 
 End Class 

This (admittedly simple) example you can extend to your own requirements.

Page Level Error Handling

Two facets:

  1. Page redirection
  2. Using the Page objects error event to capture exceptions

Taking each in turn:

Page redirection

Unforeseen errors can be trapped with the ErrorPage property of the Page object. This allows definition of a redirection URL in case of unhandled exceptions. A second step is required to enable such error trapping however – setting of the customErrors attribute of your web.config file, as follows:

 <configuration>
   <system.web>
     <customErrors mode="On">
     </customErrors>
   </system.web>
 </configuration> 

Then you’ll get your page level redirection rather than the page itself returning an error, so the following would work:

 <%@ Page ErrorPage="http://www.cymru-web.net/GenericError.htm" %> 

A useful option in addition to ‘on’ and ‘off’ for the mode attribute of customErrors is ‘RemoteOnly’ as when specified redirection will only occur if the browser application is running on a remote computer. This allows those with access to the local computer to continue to see the actual errors raised.

Page Objects Error Event

The Page object has an error event that is fired when an unhandled exception occurs in the page. It is not fired if the exception is handled in the code for the page. The relevant sub is Page_Error which you can use in your page as illustrated by the following code snippet:

 Sub Page_Error(sender as Object, e as EventArgs)
 
   dim PageException as string = Server.GetLastError().ToString()
   dim strBuild as new StringBuilder()
   strBuild.Append("Exception!")
   strBuild.Append(PageException)
 
   Response.Write(strBuild.ToString())
   Context.ClearError()
 
 End Sub 

As with page redirection this allows you to handle unexpected errors in a more user-friendly fashion. We’ll return to explain some of the classes used in the snippet above when we look at a similar scenario at the application level in the next section.

Application Level Error Handling

In reality you are more likely to use application level error handling rather than the page level just introduced. The Application_Error event of the global.asax exists for this purpose. Unsurprisingly, this is fired when an exception occurs in the corresponding web application.

In the global.asx you code against the event as follows:

 Sub Application_Error(sender as Object, e as EventArgs)
   'Do Something
 End Sub 

Unfortunately the EventArgs in this instance are unlikely to be sufficiently informative but there are alternative avenues including that introduced in the code of the last section – the HttpServerUtility.GetLastError method which returns a reference to the last error thrown in the application. This can be used as follows:

 Sub Application_Error(sender as Object, e as EventArgs)
 
   dim LastException as string = Server.GetLastError().ToString()
 
   Context.ClearError()
 
   Response.Write(LastException)
 
 End Sub 

Note that the ClearError method of the Context class clears all exceptions from the current request – if you don’t clear it the normal exception processing and consequent presentation will still occur.

Alternatively there is the HttpContext’s class Error property which returns a reference to the first exception thrown for the current HTTP request/ response. An example:

 Sub Application_Error(sender as Object, e as EventArgs)
 
   dim LastException as string = Context.Error.ToString()
 
   Context.ClearError()
 
   Response.Redirect("CustomErrors.aspx?Err=" & Server.UrlEncode(LastException))
 
 End Sub 

This illustrates one method for handling application errors – redirection to a custom error page which can then customise the output to the user dependent on the actual error received.

Finally, we also have the ability to implement an application wide error page redirect, via the customErrors section of web.config already introduced, using the defaultRedirect property:

 <configuration>
   <system.web>
     <customErrors mode="on" defaultRedirect="customerrors.aspx?err=Unspecified">
     <error statusCode="404" redirect="customerrors.aspx?err=File+Not+Found"/>
     </customErrors>
   </system.web>
 </configuration> 

Note this also demonstrates customised redirection via the error attribute. The HttpStatusCode enumeration holds the possible values of statusCode. This is too long a list to present here - see the SDK documentation.

Conclusion

I hope this has provided a useful introduction to the error handling facilities of ASP.NET. I further hope you’ll now go away and produce better code that makes use of these facilities! In summary:

  • Trap possible errors at design time via Option Explicit and Option Strict.
  • Consider in detail the possible errors that could occur in your application.
  • Use the Try … Catch … Finally construct to trap these errors at the page level and elsewhere.
  • It is generally considered poor programming practice to use exceptions for anything except unusual but anticipated problems that are beyond your programmatic control (such as losing a network connection). Exceptions should not be used to handle programming bugs!
  • Use application level exception handling (and perhaps also Page level) to trap foreseen and unforeseen errors in a user friendly way.

References

ASP.NET: Tips, Tutorial and Code
Scott Mitchell et al.
Sams

Programming Visual Basic .NET
Francesco Balena
Microsoft Press

.NET SDK Documentation

http://15seconds.com/issue/030102.htm 
15 Seconds : Web Application Error Handling in ASP.NET
 

http://msdn.microsoft.com/msdnmag/issues/02/11/NETExceptions/default.aspx
.NET Exceptions: Make the Transition from Traditional Visual Basic Error Handling to the Object-Oriented Model in .NET

State Management in ASP.NET

3. March 2003 15:31 by Chris in dev  //  Tags:   //   Comments (0)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn.

Introduction

The web is a stateless medium – state is not maintained between client requests by default. Technologies must be utilized to provide some form of state management if this is what is required of your application, which will be the case for all but the simplest of web applications. ASP.NET provides several mechanisms to manage state in a more powerful and easier to utilize way than classic ASP. It is these mechanisms that are the subject matter for this article.

Page Level State - ViewState

Page level state is information maintained when an element on the web form page causes a subsequent request to the server for the same page – referred to as ‘postback’. This is appropriately called ViewState as the data involved is usually, though not necessarily, shown to the user directly within the page output.

The Control.ViewState property provides a dictionary object for retaining values between such multiple requests for the same page. This is the method that the page uses to preserve page and control property values between round trips.

When the page is processed, the current state of the page and controls is hashed into a string and saved in the page as a hidden field. When the page is posted back to the server, the page parses the view state string at page initialization and restores property information in the page.

ViewState is enabled by default so if you view a web form page in your browser you will see a line similar to the following near the form definition in your rendered HTML:

 <input type="hidden" name="__VIEWSTATE"
 value="dDwxNDg5OTk5MzM7Oz7DblWpxMjE3ATl4Jx621QnCmJ2VQ==" /> 

When a page is re-loaded two methods pertaining to ViewState are called: LoadViewState and SaveViewState. Page level state is maintained automatically by ASP.NET but you can disable it, as necessary, by setting the EnableViewState property to false for either the controls whose state doesn’t need to be maintained or for the page as a whole. For the control:

 <asp:TextBox id=”tbNamerunat=”serverEnableViewState=”false/> 

for the page:

 <%@ Page EnableViewState=”false” %> 

You can validate that these work as claimed by analyzing the information presented if you turn on tracing for a page containing the above elements. You will see that on postback, and assuming ViewState is enabled, that the LoadViewState method is executed after the Page class’ Init method has been completed. SaveViewState is called after PreRender and prior to actual page rendering.

You can also explicitly save information in the ViewState using the State Bag dictionary collection, accessed as follows:

 ViewState(key) = value 

Which can then be accessed as follows:

 Value = ViewState(key) 

It is important to remember that page level state is only maintained between consecutive accesses to the same page. When you visit another page the information will not be accessible via the methods above. For this we need to look at other methods and objects for storing state information.

Session Level State

What is a user session? Slightly simplifying it’s the interaction between a user’s first request for a page from a site until the user leaves the site again. Now what if you login into this site at your first request, assuming this is a site you have previously registered with. How does the application remember who you are for the rest of the ‘session’? Or if you have items in your shopping cart within an e-commerce web application how does the application ‘remember’ this information when you request to go to the checkout?

The answer may well be session state though the underlying mechanism by which this is achieved may be one of several options. ASP.NET creates a session (it reserves a section of memory) for a user when they first arrive at the site and assigns that user session a unique id that is tied to that section of memory. By default, ASP.NET then creates a cookie on the client that contains this same id. As this id will be sent with any subsequent http requests to this server ASP.NET will be able to match the user against the reserved section of memory. Further the application can store data related to that user session in this reserved memory space to maintain state between requests. It must be remembered that using session state is using server resources for every user of the site so you need to consider the resources required of items you choose to store.

An example:

 session(“name”)=”Chris Sully” 

sets a session variable with key ‘name’ and value “Chris Sully”. To retrieve/ display this we use:

 lblUserName.text=session(“name”) 

In this case assigning to the text property of a label web server control.

By default the resources associated with this session state maintenance are released if the user does not interact with the site for 20 minutes. If the user returns after a 20 minutes break a new session will have been created and any data associated with their previous session will have been lost. However, you can also destroy the session information for a user earlier within the application if so desired via

 Session.Abandon 

and you may also change the default value from 20 minutes. To redefine the session timeout property you use:

 Session.Timeout = 5 

Alternatively, and representing a more likely scenario, you would specify the value in your web.config file:

 <configuration>
   <system.web>
     <sessionState timeout=10 />
   </system.web>
 </configuration> 

which would half the default timeout from 20 to 10 minutes.

We'll return to some of the other sessionState properties in subsequent sections.

Looking at a little more detail at the session initialization process:

  1. User makes a request of the server
  2. ASP.NET retrieves the user’s sessionID via the cookie value passed in the HTTP request from the client computer. If one does not exist ASP.NET creates one, and raises the Session_OnStart event (which can be reacted to in the global.asax, amongst other locations).
  3. ASP.NET retrieves any data relating to the sessionID from a session data store. This data store may be of a variety of types, as we shall explore in subsequent sections.
  4. A System.Web.SessionState.SessionState object is created and populated with the data from the previous step. This is the object you access when using the shortcut session(“name”)=”Chris Sully”

There is also a Session_OnEnd which you can code against in your global.asax.

SQLServer

We can use SQLServer to store our session state. We can use other databases if we want to but ASP.NET includes built in support for SQLServer as well as other data stores that make life easier for developers. As you might expect, SQLServer should be the choice for storage of session information in high end, performance critical web applications.

To enable session state storage with SQLServer we'll need the infrastructure to support this, i.e. the tables and stored procedures. Microsoft has supplied these and they are located at

C:\winnt\Microsoft.net/framework/[version directory]

On my system:

C:\winnt\Microsoft.net/framework/v1.0.3705/InstallSQLState.sql

And you also have the TSQL to remove the setup: UninstallSQLState.sql.

So, to enable SQLServer session support open up and execute InstallSQLState.sql in query analyzer. If you then investigate what’s new in your SQLServer setup you will see a new database named AspState with 15 or so stored procedures used to insert and retrieve the associated session data. You won’t see any new tables! However, if you expand the tempdb database and view the tables therein, you will see two new tables: ASPStateTempApplications and ASPStateTempSessions which is where our session state information shall be held.

Now, all we need to do, as ASP.NET takes care of everything else for us, is modify web.config so that the ASP.Net application knows it should be using SQLServer for session state management:

 <configuration>
   <system.web>
     <sessionState mode=”sqlserversqlConnectionString=”connectionString/>
   </system.web>
 </configuration> 

where you should replace connectionString with that for you own machine.

If you want to test this, create a simple page which adds an item to the session state. This can be as simple as assigning a value to a session variable a la: session(“name”)=”Chris Sully”, perhaps simply placing this in the OnLoad sub of an otherwise blank aspx. View this in your browser. Don’t close the browser window after the page is loaded as you’ll end the session. Remember that after the first request by default the session will last 20 minutes.

If you now examine the contents of the ASPStateTempApplications table in tempdb either with Enterprise Manager or Query Analyser, you will see an entry corresponding to the above set session variable.

The other possibly important consideration is that the session data is stored external to the ASP.Net process. Hence, even if we restart the web server the information will remain available to subsequent user requests. We’ll look at another mechanism for such data isolation shortly.

Cookies

We’ve already introduced that cookies are central to the concept of session – they are the client half of how session state is maintained over browser requests. As well as using them for user identification purposes we can also use them to store custom session information. Cookie functionality is exposed via the HttpCookie class in .NET. This functionality may also be accessed via the Response and Request ASP.NET classes.

Cookies used in this way don’t tie in strongly with the inbuilt state management capabilities of .NET but they can provide useful custom session state management functionality.

To add a cookie (store on the client browser machine) you use the response object, e.g.:

 response.cookies(“ExampleCookie”)(“Time”)= datetime.now.tostring(“F”) 

i.e.,

 response.cookies(“CookieName”)(“keyName”) = value 

‘F’ refers to the Full date/time pattern (long time) by the way.

To retrieve a cookie you use the request object. The cookie is associated with the server URL and hence will be automatically appended to the request parameter collection and hence accessible as follows:

 TimeCookieSet = Request.Cookies(“ExampleCookie”)(“Time”) 

There are other options available. For example, you may set the Expires property of the cookie. This may be a fixed date or a length of time from the present date:

 Response.cookies(“ExampleCookie”).Expires = DateTime.FromString(“12/12/2003”)
 
 Response.cookies(“ExampleCookie”).Expires = DateTime.Now.AddMonths(6) 

Cookie Munging

You may wish to configure your applications to use session state without relying on cookies. There could be several reasons for this:

  1. You need to support old browser types that do not support cookies.
  2. You wish to cater for people who have chosen to disable cookie support within their browser.
  3. Certain types of domain name redirection mean that cookies / conventional state management will not work.

Cookie munging causes information regarding the session state id to be added to URL information. Thus the link between client and session data on the server is maintained.

It is simply enabled, as follows:

 <configuration>
   <system.web>
     <sessionState cookieless=”true/>
   </system.web>
 </configuration> 

If you change your web.config file to the above and then view a page which uses both the session object and postback you’ll see ASP.Net has inserted some extra information in the URL of the page. This extra information represents the session ID.

Cookie munging certainly does the job in instances where it is required but should be avoided where it is not, as it is insecure being susceptible to manual manipulation of the URL string.

Session State Server

Session State Server runs independently of ASP.Net, the current application, web server and (possibly) the server meaning you have good isolation of data and hence if there is a problem with the web server you may still be able to recover the user session data. This without the need for SQLServer. The State Server also presents a number of extra facilities including the choice of where and how to deploy the facility.

You can run the StateServer in the same process as the ASP.Net application (“InProc” – the default). This is similar to how classic ASP managed session state. Why would we want to do this when we have just lauded the benefits of data isolation? Performance is the answer – data stored in the same process is accessed quickly.

You can test this via restarting IIS (iisreset) when you know you have some session data in your application. You could also try restarting the ASP.NET application via the MMC snap-in, the effect is the same. This is achieved by removing and recreating the IIS application (right click-properties on the application sub-directory/ virtual directory).

A couple of simple test scripts would be:

1:

 <configuration>
   <system.web>
     <sessionState cookieless=”true/>
   </system.web>
 </configuration><%@ Page Language="VB" %>
 <html>
 <head>
 </head>
 <body>
   <form runat="server" ID="Form1">
   <% session("test")="test" %>
   </form>
 </body>
 </html>  

2:

 <%@ Page Language="VB" %>
 <html>
 <head>
 </head>
 <body>
   <form runat="server" ID="Form2">
   <%=session("test")%>
   </form>
 </body>
 </html> 

So, if you run 1, then 2 directly after without closing the browser window, the value will be maintained and you’ll see ‘test’ displayed. If you reset IIS or the IIS application in between the browser requests session information will be lost.

You can also run State Server out of process (“Out-of-Proc”) which means that session state information can be stored in alternative locations to the ASP.NET process space. Hence, if there is a problem with the application or web server state is maintained. In this scenario, the ASPNETState process (aspnet_state.exe) is used, which runs as a service, storing session data in memory.

First thing we need to do therefore is make sure this service is up and running. This can be done via the command line:

Net start aspnet_state

Though if you’re going to this out of process support for an application you will want to setup the service to start on machine boot up (admin tools – services).

Next it’s back to that Web.Config file to tell the application to use the facility:

 <configuration>
   <system.web>
     <sessionState mode=”stateserverstateConnectionString=”tcpip=127.0.0.1:80/>
   </system.web>
 </configuration> 

In actual fact the stateConnectionString attribute is not required when pointing to a port on the local machine, as it is above (which supplies the default setting), but is important if you wish to use another machine and port for extra security/ reliability. It is included to demonstrate the syntax.

If you now go back and try the session maintenance test you won’t lose that session data.

Application level state

Using Application state is similar to using session state: the HttpApplicationState class contains an Application property that provides dictionary access to variables:

 Application(“key”) = value 

The difference with session state is that data is stored per IIS application as opposed to per user session. Thus if you set up:

 application(“name”)=”Chris Sully” 

in your global.asax file for the application this value is available to all pages of your application and will be the same value for all users who visit.

Setting state information is not limited to global.asax – any page of the application being run by any user can do this. There is an implication here – multiple users trying to change the value of an application variable could lead to data inconsistencies. To prevent this the HttpApplicationState class provides the lock method that ensures only one user (actually a process thread) is accessing the application state object at any one time. Thus you should lock before amending an application value and unlock immediately afterwards so others have access to the variable.

Other options for state management: caching

Another method of storing data on the server is via the cache class. This effectively extends the capabilities of storing data at the application level. Frequently used data items may be stored in the cache memory for presentation. For example, data for a drop down list may be stored in a cached object so that the cached copy is used rather than obtaining the data from the database on every occasion.

You may store your objects in cache memory and set the properties of the cache to control when the cache might release its resources. For example, you might use the TimeSpan property that specifies how long the item should remain in the cache after it is last accessed. You can also specify a dependency of the cached item on a datasource, for example an XML file. If this file is changed the chance can be programmed to react to this event accordingly and update the cached object.

For further information on this subject see my article on caching on dotnetjohn.

Conclusion

I hope this article has provided a useful overview of the state management options and support in .NET. With all the forms of state management there exists a balance between the needs of the application and the associated resources used.

In particular, be aware that page level state is enabled by default and should be disabled if not required, particularly if you are manipulating significant amounts of data within DataBound controls. As well as using server resources, leaving ViewState enabled in such a situation will increased the page size, and hence download times, significantly. In such a situation it may be better to cache the data on the server for the postback rather than transmit the data in the ViewState, or even rely on the data caching facilities of your chosen DBMS.

ASP.Net provides significantly extended support for session state maintenance via SQLServer and Session State Server. The choice is down to the needs of your application and, in particular, how important data isolation from your IIS application is.

References

.NET Framework SDK documentation

ASP.NET: Tips, Tutorials and Code Sams Mitchell et al.

Securing an ASP.Net application

30. January 2003 15:24 by Chris in dev  //  Tags:   //   Comments (0)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn, where the code is also available for download. 

Introduction

This article considers and develops a reasonably secure login facility for use within an Internet application utilizing the inbuilt features of ASP.Net. This login facility is intended to protect an administrative section of an Internet site where there are only a limited number of users who will have access to that section of the site. The rest of the site will be accessible to unauthorized users. This problem specification will guide our decision-making.

Also presented are suggestions as to how this security could be improved if you cross the boundary of ASP.Net functionality into supporting technologies. Firstly, however I'll provide an overview of web application security and the features available in ASP.Net, focusing particularly on forms based authentication, as this is the approach we shall eventually use as the basis for our login facility.

Pre-requisites for this article include some prior knowledge of ASP.Net (web.config, security, etc.) and related technologies (e.g. IIS) as well as a basic understanding of general web and security related concepts, e.g. HTTP, cookies.

Web application security: authentication and authorization

Different web sites require different levels of security. Some portions of a web site commonly require password-protected areas and there are many ways to implement such security, the choice largely dependent on the problem domain and the specific application requirements.

Security for web applications is comprised of two processes: authentication and authorization. The process of identifying your user and authenticating that they are who they claim they are is authentication. Authorization is the process of determining whether the authenticated user has access to the resource they are attempting to access.

The authentication process requires validation against an appropriate data store, commonly called an authority, for example an instance of Active Directory.

ASP.Net provides authorization services using both the URL and the file of the requested resource. Both checks must be successful for the user to be allowed to proceed to access said resource.

Authentication via ASP.Net

ASP.Net arrives complete with the following authentication providers that provide interfaces to other levels of security existing within and/ or external to the web server computer system:

  • integrated windows authentication using NTLM or Kerberos.
  • forms based authentication
  • passport authentication

As with other configuration requirements web.config is utilized to define security settings such as:

  • the authentication method to use
  • the users who are permitted to use the application
  • how sensitive data should be encrypted

Looking at each authentication method in turn with a view to their use in our login facility:

Integrated Windows

This is a secure method but it is only supported by Internet Explorer and therefore most suited to intranet situations where browser type can be controlled. In fact it is the method of choice for Intranet applications. Typically it involves authentication against a Windows domain authority such as Active Directory or the Security Accounts Manager (SAM) using Windows NT Challenge/ Response (NLTM).

Integrated Windows authentication uses the domain, username and computer name of the client user to generate a ‘challenge’. The client must enter the correct password which will causes the correct response to be generated and returned to the server.

In order for integrated Windows authentication to be used successfully in ASP.Net the application needs to be properly configured to do so via IIS – you will commonly want to remove anonymous access so users are not automatically authenticated via the machines IUSR account. You should also configure the directory where the protected resource is located as an application, though this may already be the case if this is the root directory of your web application.

Consideration of suitability

As integrated Windows authentication is specific to Internet Explorer it is not a suitable authentication method for use with our login facility that we have specified we wish to use for Internet applications. In such a scenario a variety of browser types and versions may provide the client for our application and we would not wish to exclude a significant percentage of our possible user population from visiting our site.

Forms based authentication

This is cookie-based authentication by another name and with a nice wrapper of functionality around it. Such authentication is commonly deemed sufficient for large, public Internet sites. Forms authentication works by redirecting unauthenticated requests to a login page (typically username and a password are collected) via which the credentials of the user are collected and validated. If validated a cookie is issued which contains information subsequently used by ASP.Net to identify the user. The longevity of the cookie may be controlled: for example you may specify that the cookie is valid only for the duration of the current user session.

Forms authentication is flexible in the authorities against which it can validate. . For example, it can validate credentials against a Windows based authority, as per integrated Windows, or other data sources such as a database or a simple text file. A further advantage over integrated Windows is that you have control over the login screen used to authenticate users.

Forms authentication is enabled in the applications web.config file, for example:

 <configuration>
   <system.web>
     <authentication mode="Forms">
       <forms name=".AUTHCOOKIE" loginURL="login.aspx" protection="All" />
     </authentication>
     <machineKey validationKey="Autogenerate" decryption key="Autogenerate" validation"SHA1" />
     <authorization>
       <deny users="?" />
   <authorization>
   </system.web>
 </configuration> 

This is mostly self-explanatory. The name element refers to the name of the cookie. The machineKey section controls the decryption that is used. In a web farm scenario with multiple web servers the key would be hard-coded to enable authentication to work. Otherwise different machines would be using different validation keys! The ‘?’ in the authorization section above by the way represents the anonymous user. An ‘*’ would indicate all users.

Within the login page you could validate against a variety of data sources. This might be an XML file of users and passwords. This is an insecure solution however so should not be used for sensitive data though you could increase security by encrypting the passwords.

Alternatively you can use the credentials element of the web.config file, which is a sub-element of the <forms> element, as follows:

 <credentials passwordFormat=”Clear>
   <user name=”Chrispassword=”Moniker/>
   <user name=”Mariapassword=”Petersburg/>
 </credentials> 

Using this method means there is very little coding for the developer to undertake due to the support provided by the .NET Framework, as we shall see a little later when we revisit this method.

Note also the passwordFormat attribute is required, and can be one of the following values:

Clear
Passwords are stored in clear text. The user password is compared directly to this value without further transformation.

MD5
Passwords are stored using a Message Digest 5 (MD5) hash digest. When credentials are validated, the user password is hashed using the MD5 algorithm and compared for equality with this value. The clear-text password is never stored or compared when using this value. This algorithm produces better performance than SHA1.

SHA1
Passwords are stored using the SHA1 hash digest. When credentials are validated, the user password is hashed using the SHA1 algorithm and compared for equality with this value. The clear-text password is never stored or compared when using this value. Use this algorithm for best security.

What is hashing? Hash algorithms map binary values of an arbitrary length to small binary values of a fixed length, known as hash values. A hash value is a unique and extremely compact numerical representation of a piece of data. The hash size for the SHA1 algorithm is 160 bits. SHA1 is more secure than the alternate MD5 algorithm, at the expense of performance.

At this time there is no ASP.Net tool for creating hashed passwords for insertion into configuration files. However, there are classes and methods that make it easy for you to create them programmatically, in particular the FormsAuthentication class. It’s HashPasswordForStoringInConfigFile method can do the hashing. At a lower level, you can use the System.Security.Cryptography classes, as well. We'll be looking at the former method later in this article.

The flexibility of the authentication provider for Forms Authentication continues as we can select SQLServer as our data source though the developer needs then to write bespoke code for validating user credentials against the database. Typically you will then have a registration page to allow users to register their login details which will then be stored in SQLServer for use when the user then returns to a protected resource and is redirected to the login page by the forms authentication, assuming the corresponding cookie is not still in existence.

This raises a further feature - we would want to give all users access to the registration page so that they may register but other resources should be protected. Additionally, there may be a third level of security, for example an admin page to list all users registered with the system. In such a situation we can have multiple system.web sections in our web.config file to support the different levels of authorization, as follows:

 <configuration>
   <system.web>
     <authentication mode="Forms">
       <forms name=".AUTHCOOKIE" loginURL="login.aspx" protection="All" />
     </authentication>
     <machineKey validationKey="Autogenerate" decryption key="Autogenerate" validation"SHA1" />
     <authorization>
       <deny users="?" />
     <authorization>
   </system.web>
 
   <location path="register.aspx">
     <system.web>
       <authorization>
         <allow users="*,?" />
       </authorization>
     </system.web>
   </location>
 
   <location path="admin.aspx">
     <system.web>
       <authorization>
         <allow users="admin " />
         <deny users="*" />
       </authorization>
     </system.web>
   </location>
 </configuration> 

Thus only the admin user can access admin.aspx, whilst all users can access register.aspx so if they don't have an account already they can register for one. Any other resource request will cause redirection to login.aspx, if a valid authentication cookie by the name of .AUTHCOOKIE isn't detected within the request. On the login page you would provide a link to register.aspx for users who require the facility.

Alternatively you can have multiple web.config files, with that for a sub-directory overriding that for the application a whole, an approach that we shall implement later for completeness.

Finally, you may also perform forms authentication in ASP.Net against a Web Service, which we won’t consider any further as this could form an article in itself, and against Microsoft Passport. Passport uses standard web technologies such as SSL, cookies and Javascript and uses strong symmetric key encryption using Triple DES (3DES) to deliver a single sign in service where a user can register once and then has access to any passport enabled site.

Consideration of suitability

Forms based authentication is a flexible mechanism supporting a variety of techniques of various levels of security. Some of the available techniques may be secure enough for implementation if extended appropriately. Some of the techniques are more suited to our problem domain than others, as we’ll discuss shortly.

In terms of specific authorities:

Passport is most appropriately utilized where your site will be used in conjunction with other Passport enabled sites and where you do not wish to maintain your own user credentials data source. This is not the case in our chosen problem domain where Passport would both be overkill and inappropriate.

SQLServer would be the correct solution for the most common web site scenario where you have many users visiting a site where the majority of content is protected. Then an automated registration facility is the obvious solution with a configuration as per the web.config file just introduced. In our chosen problem domain we have stated that we potentially have only a handful of users accounts accessing a small portion of the application functionality and hence SQLServer is not necessarily the best solution, though is perfectly viable.

Use of the credentials section of the forms element of web.config or a simple text/ XML file would seem most suitable for this problem domain. The extra security and simplicity of implementation offered by the former makes this the method of choice.

Authorization via ASP.Net

As discussed earlier this is the second stage of gaining access to a site: determining whether an authenticated user should be permitted access to a requested resource.

File authorization utilizes windows security services access control lists (ACLs) – using the authorized identity to do so. Further, ASP.Net allows further refinement based on the URL requested, as you may have recognized in the examples already introduced, as well as the HTTP request method attempted via the verb attribute, valid values of which are: GET, POST, HEAD or DEBUG. I can't think of many occasions in which you'd want to use this feature but you may have other ideas! You may also refer to windows roles as well as named users.

A few examples to clarify:

 <authorization>
   <allow users=”Chris/>
   <deny users=”Chris/>
   <deny users=”*” />
 </authorization> 

You might logically think this would deny all users access. In fact Chris still has access, as when ASP.Net finds a conflict such as this it will use the earlier declaration.

 <authorization>
   <allow roles=”Administrators/>
   <deny users=”*” />
 </authorization>
 
 <authorization>
   <allow verbs=”GET, POST/>
 </authorization> 

Impersonation

Impersonation is the concept whereby an application executes under the context of the identity of the client that is accessing the application. This is achieved by using the access token provided by IIS. You may well know that by default the ASPNET account is used to access ASP.Net resources via the Aspnet_wp.exe process. This, by necessity, has a little more power than the standard guest account for Internet access, IUSR, but not much more. Sometimes you may wish to use a more powerful account to access system resources that your application needs. This may be achieved via impersonation as follows:

 <system.web>
   <identity impersonate=”true/>
 </system.web> 

or you may specify a particular account:

 <system.web>
   <identity impersonate=”falseuserName=”domain\sullycpassword=”password/>
 </system.web> 

Of course you will need to provide the involved accounts with the necessary access rights to achieve the goals of the application. Note also that if you don’t remove IUSR from the ACLs then this is the account that will be used – this is unlikely to meet your needs as this is a less powerful account than ASPNET.

ASP.Net will only impersonate during the request handler - tasks such as executing the compiler and reading configuration data occur as the default process account. This is configurable via the <processModel> section of your system configuration file (machine.config). Care should be taken however not to use an inappropriate (too powerful) account which exposes your system to the threat of attacks.

The situation is further complicated by extra features available in IIS6 … but we’ll leave those for another article perhaps as the situation is complex enough!

Let’s move onto developing a login solution for our chosen problem domain.

Our Chosen Authentication Method – how secure is it?

We've chosen forms based authentication utilizing the web.config file as our authority. How secure is the mechanism involved? Let's consider this by examining the process in a little more detail. As a reminder, our application scenario is one of a web site where we've put content which we want to enable restricted access to in a sub-directory named secure. We have configured our web.config files to restrict access to the secure sub-directory, as described above. We deny access to the anonymous users (i.e. unauthenticated users) to the secure sub-directory:

 <authorization>
   <deny users="?" />
 </authorization> 

If someone requests a file in the secure sub-directory then ASP.Net URL authentication kicks in - ASP.Net checks to see if a valid authentication cookie is attached to the request. If the cookie exists, ASP.Net decrypts it, validates it to ensure it hasn't been tampered with, and extracts identity information that it assigns to the current request. Encryption and validation can be turned off but are enabled by default. If the cookie doesn't exist, ASP.Net redirects the request to the login page. If the login is successful, the authentication cookie is created and passed to the user’s browser. This can be configured to be a permanent cookie or a session-based cookie. Possibly slightly more secure is a session-based cookie where the cookie is destroyed when the user leaves the application or the session times out. This prevents someone else accessing the application from the user’s client machine without having to login.

Given the above scenario we have two security issues for further consideration:

    1. How secure is the cookie based access? Note above that encryption and validation are used by default. How secure are these in reality?

      Validation works exactly the same for authentication cookies as it does for view state: the <machineKey> element's validationKey is appended to the cookie, the resulting value is hashed, and the hash is appended to the cookie. When the cookie is returned in a request, ASP.Net verifies that it wasn't tampered with by rehashing the cookie and comparing the new hash to the one accompanying the cookie. Encryption works by encrypting the cookie, hash value and all with <machineKey>'s decryptionKey attribute. Validation consumes less CPU time than encryption and prevents tampering. It does not, however, prevent someone from intercepting an authentication cookie and reading its contents.

      Encrypted cookies can't be read or altered, but they can be stolen and used illicitly. Time-outs are the only protection a cookie offers against replay attacks, and they apply to session cookies only. The most reliable way to prevent someone from spoofing your site with a stolen authentication cookie is to use an encrypted communications link (HTTPS). Talking of which, this is one situation when you might want to turn off both encryption and validation. There is little point encrypting the communication again if you are already using HTTPS.

      Whilst on the subject of cookies, remember also that cookie support can be turned off via the client browser. This should also be borne in mind when designing your application.

  • How secure is the logging on procedure to a web form? Does it use clear text username and password transmission that could be susceptible to observation, capture and subsequent misuse?

Yes is the answer. Thus if you want a secure solution but don't want the overhead of encrypting communications to all parts of your site, consider at least submitting user names and passwords over HTTPS, this assuming your web hosting service provides this.

To reiterate, the forms security model allows us to configure keys to use for encryption and decryption of forms authentication cookie data. Here we have a problem - this only encrypts the cookie data - the initial login screen data, i.e. email / password is not encrypted. We are using standard HTTP transmitting data in clear text which is susceptible to interception. The only way around this is to go to HTTPS and a secure communication channel.

Which perhaps begs the question – what is the point of encrypting the cookie data if our access is susceptible anyway if we are using an unsecured communication channel? Well, if we enable cookie authentication when we first login then subsequent interaction with the server will be more secure. After that initial login a malicious attacker could not easily gain our login details and gain access to the site simply by examining the contents of the packets of information passed to and from the web server. However, note the earlier comments on cookie theft. It is important to understand these concepts and the impact our decisions have on the overall security of our application data.

It is perhaps unsurprising given the above that for the most secure applications:

  1. A secure HTTPS channel is used whenever dealing with username/ password/ related data.
  2. Cookies are not exclusively relied upon: often though recall of certain information is cookie based important transactions still require authorization via an encrypted password or number.

It is up to the application architect/ programmer to decide whether this level of security is appropriate to their system.

Finally, before we actually come up with some code remember that forms based security secures only ASP.Net resources. It doesn’t protect HTML files, for example. Just because you have secured a directory using web.config / ASP.Net doesn’t mean you have secured all files in that directory. To do this you could look at features available via IIS.

The 'Application'

Finally to the code and making our ASP.Net application as secure as possible using the facilities ASP.Net provides. Taking the above described scenario where we have a secure sub-directory the files within which we wish to protect. However, we anticipate there will only be a handful of users who will need access to the directory and hence this is a suitable problem domain to be addressed with a web.config based authority solution as earlier decided.

Starting with our web.config file. We can secure the sub-directory either via the location element, as described above, but just to demonstrate the alternative double web.config based approach, here is the web.config at the root level:

 <configuration>
   <system.web>
     <authentication mode="Forms">
       <forms name=".AUTHCOOKIE" loginUrl="login_credentials.aspx" protection="All">
         <credentials passwordFormat="Clear">
           <user name="chris" password="password" />
         </credentials>
       </forms>
     </authentication>
     <machineKey validationKey="AutoGenerate" decryptionKey="AutoGenerate" validation="SHA1" />
     <authorization>
       <allow users="*" />
     </authorization>
   </system.web>
 </configuration> 

You can see that this sets up forms based security enabling validation and encryption and specifies a credentials list of one user, currently in Cleartext format but shortly we'll see how to encrypt the password via SHA1. You'll also see that this file doesn’t actually restrict user access at all so URL based authentication will not be used at the root level of our application. However, if we extend the configuration for the secure sub-directory via an additional web.config file:

 <configuration>
   <system.web>
     <authorization>
       <deny users="?" />
     </authorization>
   </system.web>
 </configuration> 

Then if a user attempts to access an ASP.Net resource in secure they will be dealt with according to the combination of directives in the web.config file and inherited from the parent web.config file, and machine.config file for that matter.

Onto the login file: you will need form fields to allow entry of username and password data. Note that security will be further improved by enforcing minimum standards on passwords (e.g. length), which can be achieved by validation controls. There is only minimal validation in the example. Note that there is no facility to request a ‘persistent cookie’ as this provides a minor security risk. It is up to you to decide whether a permanent cookie is acceptable in your application domain.

Then in the login file, login_credentials.aspx, after allowing the user to enter username and password data, in the sub executed on the server when the submit form button is clicked we validate the entered data against the web.config credentials data, achieved simply as follows:

 If FormsAuthentication.Authenticate(Username.Value, UserPass.Value) Then 
   FormsAuthentication.RedirectFromLoginPage (UserName.Value, false) 
 Else 
   Msg.text="credentials not valid" 
 End If 

Could it be any simpler? The FormsAuthentication object knows what authority it needs to validate against as this has been specified in the web.config file. If the user details match, the code proceeds to redirect back to the secured resource and also sets the cookie for the user session based on the user name entered. The parameter 'false' indicates that the cookie should not be permanently stored on the client machine. Its lifetime will be the duration of the user session by default. This can be altered if so desired.

Back to web.config to improve the security. The details are being stored unencrypted – we can encrypt them with the aforementioned HashPasswordForStoringInConfigFile of the FormsAuthentication class, achieved simply as follows:

 Private Function encode(ByVal cleartext As String) As String
   encode = FormsAuthentication.HashPasswordForStoringInConfigFile(cleartext, "SHA1")
   Return encode
 End Function 

This is the key function of the encode.aspx file provided with the code download, which accepts a text string (the original password – ‘password’ in this case) and outputs a SHA1 encoded version care of the above function.

Thus, our new improved configuration section of our root web.config file becomes:

 <credentials passwordFormat="SHA1">
   <user name="chris" password="5BAA61E4C9B93F3F0682250B6CF8331B7EE68FD8" />
 </credentials> 

To summarize the involved files:

Root/web.config root web.config file
Root/webform1.aspx test page
Root/login_credentials.aspx login page
 
Root/encode.aspx form to SHA1 encode a password for <credentials>
Root/secure/web.config directives to override security for this sub-directory to deny anonymous access
Root/secure/webform1.aspx test page

Conclusions

We’ve looked at the new security features of ASP.Net focusing particularly on an application scenario where forms based authentication uses the credentials section of web.config, but presenting this in the context of wider security issues.

In summary you should consider forms based authentication when:

  • User names and passwords are stored somewhere other than Windows Accounts (it is possible to use forms authentication with Windows Accounts but in this case Integrated Windows authentication may well be the best choice).
  • You are deploying your application over the Internet and hence you need to support all browsers and client operating systems.
  • You want to provide your own user interface form as a logon page.

You should not consider forms based authentication when:

  • You are deploying an application on a corporate intranet and can take advantage of the more secure Integrated Windows authentication.
  • You are unable to perform programmatic access to verify the user name and password.

Further security considerations for forms based authentication:

  • If users are submitting passwords via the logon page, you can (should?) secure the channel using SSL to prevent passwords from being easily obtained by hackers.
  • If you are using cookies to maintain the identity of the user between requests, you should be aware of the potential security risk of a hacker "stealing" the user's cookie using a network-monitoring program. To ensure the site is completely secure when using cookies you must use SSL for all communications with the site. This will be an impractical restriction for most sites due to the significant performance overhead. A compromise available within ASP.Net is to have the server regenerate cookies at timed intervals. This policy of cookie expiration is designed to prevent another user from accessing the site with a stolen cookie.

Finally, different authorities are appropriate for form-based authentication for different problem domains. For our considered scenario where the number of users was limited as we were only protecting a specific administrative resource credentials / XML file based authorities are adequate. For a scenario where all site information is ‘protected’ a database authority is most likely to be the optimal solution.

References

ASP.Net: Tips, Tutorial and Code
Scott Mitchell et al.
Sams

.Net SDK documentation

Various online articles, in particular:

ASP.Net Security: An Introductory Guide to Building and Deploying More Secure Sites with ASP.Net and IIS -- MSDN Magazine, April 2002
http://msdn.microsoft.com/msdnmag/issues/02/04/ASPSec/default.aspx
An excellent and detailed introduction to IIS and ASP.Net security issues.

http://msdn.microsoft.com/library/default.asp?url=/library/en-us/dnbda/html/authaspdotnet.asp
Authentication in ASP.Net: .Net Security Guidance

You may download the code here.

About the author

I am Dr Christopher Sully (MCPD, MCSD) and I am a Cardiff, UK based IT Consultant/ Developer and have been involved in the industry since 1996 though I started programming considerably earlier than that. During the intervening period I've worked mainly on web application projects utilising Microsoft products and technologies: principally ASP.NET and SQL Server and working on all phases of the project lifecycle. If you might like to utilise some of the aforementioned experience I would strongly recommend that you contact me. I am also trying to improve my Welsh so am likely to blog about this as well as IT matters.

Month List