Windows Phone 7.8 update issues

4. February 2013 09:16 by Chris in Windows Phone  //  Tags:   //   Comments (0)

I installed Windows Phone 7.8 over the weekend. It has not gone smoothly. First Zune refused to recognise the phone (it did eventually), and this morning my Nokia 800 locked up for the first time ever. I had to google a soft reset, which thankfully worked, from here: "Press and hold the Volume down and Power buttons until it vibrates, after that you have to release all the keys then phone will vibrate 3 times." This happened while playing a podcast. I have had issues with the media player in the past, chiefly it not maintaining the state of where I was in the podcast, but nothing this major.

Here's hoping for few further issues with 7.8 which, by the way, doesn't seem to add a great deal to 7.5. So I might be googling how to roll back to 7.5 in the not too distant future if the problems persist. Not to downplay the complexity of any development when Microsoft is effectively supporting 2 platforms, but 7.8 has been so long in development one would have thought they could have got it right. Too many issues and my loyalty to Windows Phone will be severely tested.

 

 

Bootstrap, other javascript tools/ frameworks and keeping up with appearances

25. January 2013 14:28 by Chris in dev  //  Tags:   //   Comments (0)

I don't know about other developers but I have this 'technical development topics list' ... products/ technologies/ concepts I've come across in passing but don't seem to consistently manage to dig into. This is particularly an issue currently as ... 

a) there is just so much "stuff" to learn about in the (Microsoft) web software development space. 10 years ago it was far easier to keep abreast of the main development technologies. Now it's not possible for one person to cover all the bases and specialisation is required, at least if you are going to dig into anything in any depth

b) closely related is the changing landscape of devices and development for those devices, which increases the aforementioned complexity still further; native vs cross platform/ device anyone?

Anyway, in an attempt to make some personal headway I thought I would try to blog about some of these topics, once a week say. A laudable idea but let's see how this goes. Not well so far, as this post has been sitting here unfinished for a month!

 

The New Way

A few years ago the ASP.NET web dev picture was different in several ways, including that life for the developer was simpler. We had ASP.NET web forms with postbacks, server controls and associated viewstate. Those server controls gradually got better in terms of user experience. People complained about how the web forms approach didn't facilitate unit testing and about the clunkiness of the state management. Ajax became more popular and this didn't fit that well into the Web forms way. Similarly Javascript, particularly in the form of jQuery became more popular as processing moved to the client in the drive for the more responsive UX. RESTful services are becoming the order of the day.

The "cutting edge developer" called for a more testable framework with less 'plumbing' and more control. Now we have ASP.NET MVC (as well as Microsoft web pages) which seems a better fit with this new Javascript-centric world than web forms. Of course there is then the option of dumping Microsoft/ Visual Studio completely for client development, and there is increasingly the option to continue this at the server side with technologies such as node.js.

Back to the client. With increased use of Javascript/ reduced use of Microsoft plumbing code have come a host of competing "frameworks" to supposedly make life easier for the developer. If you don't spend half your life trying to work out which framework you should be using for a given project, that is. K. Scott Allen (check out his Pluralsight videos) was on DotNetRocks recently and one of his hopes for the year was that the web dev landscape would simplify. I agree ... if we could go a little way back to it being more obvious which frameworks to use, and when, whilst also maintaining the benefits of this "brave new world", this would surely be a happier situation. Scott rattled off a few technologies/ projects/ frameworks during that show, so let's very briefly cover those and I'll plan to return to cover more of them, and probably other new ones that have popped up in the interim, in subsequent posts. Oh, and these are from my notes from the show to follow up on, so I may have also added one or two more than were originally stated! Some of the headline descriptions provided by the tools' sites are not very useful but ...

  • Knockout (http://knockoutjs.com/) - 'simplify dynamic JavaScript UIs by applying the MVVM pattern'
  • Backbone (http://backbonejs.org/) - 'Backbone.js gives structure to web applications by providing models with key-value binding and custom events, collections with a rich API of enumerable functions, views with declarative event handling, and connects it all to your existing API over a RESTful JSON interface.'
  • Spine (http://spinejs.com/) - 'Build Awesome JavaScript MVC Applications' - useful overview description right there!
  • Angular (http://angularjs.org/) - 'HTML enhanced for web apps!' - ditto.
  • Masonry (http://masonry.desandro.com/) - 'A dynamic layout plugin for jQuery'
  • Modernizr (http://modernizr.com/) - 'A JavaScript library that detects HTML5 and CSS3 features in the user’s browser'
  • Bootstrap (http://twitter.github.com/bootstrap/) - 'Sleek, intuitive, and powerful front-end framework for faster and easier web development'
  • CoffeeScript (http://coffeescript.org/) - 'CoffeeScript is a little language that compiles into JavaScript. Underneath all those awkward braces and semicolons, JavaScript has always had a gorgeous object model at its heart. CoffeeScript is an attempt to expose the good parts of JavaScript in a simple way. '
  • TypeScript (http://www.typescriptlang.org/) - 'TypeScript is a language for application-scale JavaScript development. TypeScript is a typed superset of JavaScript that compiles to plain JavaScript. Any browser. Any host. Any OS. Open Source. '
  • Skeleton (http://www.getskeleton.com), which I used for my own website and elsewhere - 'A Beautiful Boilerplate for Responsive, Mobile-Friendly Development'. See also this tutorial I found useful.
  • LESS (http://lesscss.org/) - 'LESS extends CSS with dynamic behavior such as variables, mixins, operations and functions.'
  • Further, though perhaps a little different intended scope, Telerik's KendoUI has also caught my eye.

A little more on Bootstrap

So, lots of interesting playthings in the arena of client web technologies, likely to help/ confuse us poor web developers, but let's have a little closer look at Bootstrap. Of the above it is most similar to Skeleton. But whereas Skeleton specifically targets CSS support for a 12 column 960px grid system, with supporting media queries and a few extras in the form of consistent styling of buttons, forms and typography, the scope of Bootstrap is a little larger, seemingly a superset including:

  • Scaffolding - Global styles for the body to reset type and background, link styles, grid system, and two simple layouts.
  • Base CSS - Styles for common HTML elements like typography, code, tables, forms, and buttons. Also includes Glyphicons, a great little icon set.
  • Components - Basic styles for common interface components like tabs and pills, navbar, alerts, page headers, and more.
  • JavaScript plugins - Similar to Components, these JavaScript plugins are interactive components for things like tooltips, popovers, modals, and more.

 And that's enough for now. I hope to have a play shortly and report back further.

 

A Welsh learner typing tip

21. January 2013 21:17 by Chris in Welsh  //  Tags:   //   Comments (0)

From http://desktoppub.about.com/cs/finetypography/ht/circumflex.htm 

Under Windows hold down ALT while typing the appropriate number code on your numeric keypad to create characters with circumflex accent marks e.g. â 0226, ô 0244

In Word: Ctrl-Shift-^ then the letter ... but this is Word-specific and won't work generally in Windows.

Staying with Word, if you want a ë it's CTRL-':', i.e. CTRL-SHIFT-';', then the 'e' or another character. You can get at this generally in Windows with ALT-137, a different scheme from above (see http://www.edu.dudley.gov.uk/ict/software/word/accents.htm). I'll get to investigating the difference between the two schemes.

 

18/05/2013

In Welsh the circumflex is known as hirnod 'long sign', acen grom 'crooked accent' and also colloquially as to bach 'little roof'. It lengthens a vowel (a, e, i, o, u, w, y), and is used particularly to differentiate between homographs; e.g. tan and tân, ffon and ffôn, gem and gêm, cyn and cŷn, or gwn and gŵn. I add this as I needed 'ŷ', and the standard way to insert a circumflex in Word, Ctrl-Shift-6 then 'y', didn't work. Instead you type the code (0177 in this case) then ALT-X. Note this is different from the general Windows approach above.

Note also that while â, ê, î, ô and û will work with the CTRL-Shift approach, ŵ as well as ŷ won't, the code for the former being 0175.
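
To make the difference between the schemes a little more concrete, here is a small Python sketch (not part of the original tip). It rests on three assumptions about a Western European Windows setup: that ALT plus a code with a leading zero is looked up in the Windows-1252 code page, that ALT plus a code without the leading zero is looked up in an older OEM code page (437 is assumed here), and that Word's ALT-X treats the typed digits as a hexadecimal Unicode code point. The function names are made up for illustration.

```python
import unicodedata

# Sketch only: the code pages below are assumptions for a Western European
# Windows setup, not something stated in the original tip.


def alt_with_leading_zero(code: int) -> str:
    """ALT+0nnn: look the decimal code up in Windows-1252 (assumed)."""
    return bytes([code]).decode("cp1252")


def alt_without_leading_zero(code: int) -> str:
    """ALT+nnn: look the decimal code up in OEM code page 437 (assumed)."""
    return bytes([code]).decode("cp437")


def word_alt_x(hex_digits: str) -> str:
    """Word's ALT-X: read the typed digits as a hexadecimal Unicode code point."""
    return chr(int(hex_digits, 16))


def in_windows_1252(char: str) -> bool:
    """True if the character exists in Windows-1252 at all."""
    try:
        char.encode("cp1252")
        return True
    except UnicodeEncodeError:
        return False


examples = [
    ("ALT-0226", alt_with_leading_zero(226)),
    ("ALT-0244", alt_with_leading_zero(244)),
    ("ALT-137", alt_without_leading_zero(137)),
    ("0177 ALT-X", word_alt_x("0177")),
    ("0175 ALT-X", word_alt_x("0175")),
]
for keys, char in examples:
    # Print the Unicode name rather than the glyph so the output is plain ASCII.
    print(f"{keys}: U+{ord(char):04X} {unicodedata.name(char)}")
```

If those assumptions hold, ŵ and ŷ are simply absent from Windows-1252 (in_windows_1252 returns False for both), which would explain why they need the code-then-ALT-X route in Word.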

P.S. you can always also Insert-Symbol as well!

Shame it can't all be easier in this day and age!

References

http://en.wikipedia.org/wiki/Circumflex

http://www.fileformat.info/info/unicode/char/177/browsertest.htm

http://www.200words-a-day.com/typing-welsh-characters.html

 

 

Welsh Learner Resources

21. January 2013 19:33 by Chris in Welsh  //  Tags:   //   Comments (0)

Following up my translation resources posts, here are some other resources for Welsh learners (work in progress).

 

S4C

Hwb

BBC - try the 'BBC vocab' to help read some Welsh

http://www.bbc.co.uk/cymru/dysgu/ - http://www.bbc.co.uk/cymru/dysgu/dysgucymraeg/ 

Catchphrase

Pigion http://www.bbc.co.uk/radiocymru/safle/pigion/ - there seem to have been budget cuts as it doesn't offer the supporting information that it used to

Grammar including all lessons as a PDF

SSIW http://www.saysomethinginwelsh.com

Also their wiki, http://ssiw.pbworks.com, e.g. http://ssiw.pbworks.com/w/page/36027266/Fiction%20and%20Poetry 

 

Clic Clonc http://www.telesgop.co.uk/clicclonc/

Facebook

http://www.facebook.com/#!/learnwelshlanguage

http://www.facebook.com/#!/LearnWelsh 

 http://www.unilang.org/course.php?res=84

 

Old Articles

9. January 2013 16:53 by Chris in dev  //  Tags:   //   Comments (0)

I've started adding my old technical articles to this blog, ascribed the dates they were originally published, but I'll list the articles here as well, though some apparently weren't permalinks so I may have to remember/ dig out the originals of these. Somewhat surprisingly, given many articles date back to 2003, the vast majority are still relevant to varying degrees. Italics and/ or italicised comments indicate those which have suffered with the passage of time, e.g. mobile development has moved on apace and exams have a habit of being deprecated.

If you do link through there seems to be no rhyme nor reason as to what rating an article gets, as far as I can see anyway!

 

Welsh Online Dictionary Resources

7. January 2013 17:22 by Chris in Welsh  //  Tags:   //   Comments (1)

I'm learning Welsh - Dysgwr Cymraeg ydw i! It does seem a bit of a slog but, stating the obvious, learning a language is difficult. In my 6th year now of evening classes I recently attended a 2 day intensive new year revision course at Cardiff University motivated by the new year and accompanying resolutions. I even made it to my first Clonc Yn Y Cwtsh. This week I had better do some homework.

Intensive/ immersive is the way to go I think if you want to develop the skills quickly and efficiently, and if you have the time, admittedly for a little longer than the two days I managed and those were tiring enough! The tutors were very good and it was also good to meet more fellow Welsh learners. The primary feeling at the end of the two days however was exasperation that I hadn't had the opportunity to learn Welsh in school when it would have been much, much easier! At least this angle seems to be covered in Wales now. Saying this, I would probably have been one of those individuals who 'lost' the language after school and had to come back to study again anyway when their interest was renewed, as was the case with some of my course peers.

Anyway, below are a few resources I find/ have found useful. I wanted to tie these together for my own reference and as we seem to be lacking any decent portal for Welsh learners centralising such information:

  • Cysill Ar-lein (http://www.cysgliad.com/cysill/arlein/)  - I've only had a quick play but looks great for the experienced learner or fluent speaker as it allows you to check whole chunks of text. This from the Language Technologies Unit of Canolfan Bedwyr - Bangor University's centre for Welsh language services, which seems to be the dominant such establishment in Wales. There is also an offline application for purchase (Cysgliad), though there is no indication of price on the website/ no ability to order online. A quick google indicated a price of £40 for Windows and that the Mac version was available for free(!).
  • Porth Termau/ Terminology Portal (http://ap.termau.org/) which provides a simple web interface to the underlying databases
  • The BBC Welsh dictionary (http://www.bbc.co.uk/wales/welshdictionary/en-cy/) which again is a web view on the Bangor database(s). Also see their other resources, e.g. Pigion and Catchphrase. I may well return to more general resources like these, Hwb, SSIW, Yr Bont, etc. in another post. 03/02/13 (thanks to Esyllt): also available via the dictionary pages is the BBC's online verb conjugator.
  • Google translate (http://translate.google.co.uk) - which I use often and seems pretty good, albeit I'm only Canolradd so my judgement might be questionable. It has some nice tools not available on other solutions. It's also embedded into Google's Chrome browser - it will recognise the language and offer to translate whole pages should you use this browser. There are a number of other web browser plugin tools which I may return to at a later date, though I don't tend to use them.
  • Geiriadur.net (http://www.geiriadur.net) - the dictionary from Trinity Saint David. I haven't used it much but know of others who prefer it.
  • 'Ap Geiriaduron' (http://www.bangor.ac.uk/canolfanbedwyr/ap_geiriaduron.php.en) All the above are online resources ... no good in the Kymin in Penarth during my lessons as connectivity is non-existent. For offline apps best served are Android and Apple users with 'Ap Geiriaduron'. Hopefully this will make it to other platforms as well in the not too distant future.
  • Geiriadur yr Academi 03/02/13 (thanks to Esyllt): another from Canolfan Bedwyr, though not listed in the Language Technologies Unit Websites page.

In summary, Canolfan Bedwyr seems to be dominating the market! Personally I used to use the BBC view on the database(s) but after buying a Nexus 7 this has largely been replaced with 'Ap Geiriaduron' and I also frequently use Google Translate. I should also, I think, give Cysill Ar-lein more of a go in terms of a learning aid.

Digon ar hyn o bryd. I may return with a list of some more general resources at a later date. Feel free to suggest some and I will collate. Similarly if I have missed any dictionary resources you like, please leave a comment via the below. Or if you know of any good web portals for Welsh learners as I've struggled to find one. So maybe I'll make one.

Chris.

Additional:

21/01/13: Glosbe - the multilingual online dictionary (http://glosbe.com/); also I've been using Google translate and it's not quite as good as I thought - gets a bit confused with the more complex ...

17/04/13: Gweiadur - I haven't delved into it much and there are a few 'inconsistencies' in this beta site, but it looks to be an interesting project.

17/06/13: Just spotted Eurfa which also includes a downloadable dictionary.

 

Comment: A Microsoft/ Skype Fiasco (no decent Skype client for Windows Phone 7.8)

30. December 2012 16:17 by Chris in Windows Phone  //  Tags: ,   //   Comments (2)

Sometimes, more frequently than should be the case, it seems like Microsoft's left hand doesn't know what its right hand is doing, or an alternative idiom might be "its arse from its elbow".

So Microsoft buys Skype. It starts improving the integration with its other products and services. I've never been a big Skype user but, in part prompted by Microsoft's purchase, I think it might be a good idea to get a Skype number for business calls. Rather than use my old, personal account I spot the fact that I can now sign in with (one of) my Microsoft logins - "Windows Live" I think is the current vernacular if marketing haven't changed it since last I looked. Great. This seems sensible - this should give me the option to pull in contacts from elsewhere ... should save some time and effort. The UI offers me the option to merge with an existing account - I don't go for this option as the UI doesn't explain what exactly this means, and anyway surely I would be able to perform a merge after registration should I so choose?!

All good - I register, I buy my Skype number. Side bar: it was impossible to find out how much this was going to cost without going through the purchase process. Seems an OK price and there is a monthly option so I can see how it goes. All set up in a few minutes and I test the number and the allied voicemail. All good so I change the contact details on my email sig and website. Quite happy.

Oh dear though ... I go to my Nokia Lumia 800 to change the skype account to match the new one so that I may receive skype calls to my mobile and it doesn't accept the microsoft login or the accompanying autogenerated skype login. I must be doing something wrong - surely Microsoft wouldn't have cocked up like this? A quick google and yes, it seems this was a deliberate decision. I check for updates to the skype client - there haven't been any for months. I can't find any roadmap for OS/ device releases. I try the skype support twitter account - there is no response in 24hrs. So I fire off a support email. I keep it brief;):

There is no option on windows phone 7 skype client to login with a Microsoft account. Solution?

I am pleasantly surprised to receive a response within an hour or so. The response itself I am less happy with:

We understand that you wish to sign in to Skype using a Microsoft account on Windows 7. We'd be glad to look into this for you.

Unfortunately this feature is not currently available in Skype. 
 
We will pass on your request to our development team for consideration and potential inclusion in a future release.

Should any further issues arise, please feel free to contact us again.

Hmmm, Windows 7? Solution? Next email:

An entirely unsatisfactory response! It’s Windows Phone 7  and the question is outstanding – what is the recommended workaround solution to enable skype usage on win phone 7? E.g. do I need to create a new skype account and merge with the Microsoft account – will this work?

Skype support response (from different support person, continued quick response):

We understand that you want to sign in to Skype on your Windows 7 phone using your Microsoft account. We know how important this is for you, and we would like to inform you further about this issue.

To view your Microsoft contacts, you need to sign in using your Microsoft email address and password on Skype. Unfortunately, the option to do this is not yet available on your current device. Even if you merged your Microsoft account to a Skype account, you will still be required to sign in using Microsoft credentials to be able to view your Microsoft contacts.

Please accept our sincerest apologies on this matter, and we thank you for your patience and understanding.

 Which isn't actually my main issue. Next email:

Thanks for the rapid responses.

I don’t particularly mind if I don’t see my Microsoft related contacts – I can presumably set them up separately if I need to. What I would like to be able to do is receive Skype calls to my mobile via the Skype number I bought yesterday but which is currently associated with the Microsoft based account I also set up yesterday. I have a separate skype account ‘olops2000’ I have previously used but I created a new Microsoft linked account yesterday and did not merge accounts at the time as I was not made aware of the Windows Phone limitation. Can the olops2000 account now be merged with the Microsoft account, for example, so that I may receive calls as above? If you can supply me a brief step by step as to how I currently can achieve this goal either way that would be great.

Thanks in advance.

 Skype support's response:

We understand that you want to sign in using your Microsoft account so you can make use of the Skype Number that you have bought under your account. Please allow us to assist you with this issue.

Please be informed that your Microsoft account is currently merged to an automatically generated Skype ID live:chris.sully, which was created when you signed up for Skype using Microsoft credentials. You cannot use this Skype Name to log in nor can you reset the password for it. To access this account, you will need to use your Microsoft email address and password.

When we unmerge your accounts so that you may merge to an existing Skype account, please note that all purchases on the live:chris.sully account will be lost. Also, it is not possible to transfer purchases from one account to another. What we can recommend is that you use another device that supports logging in using a Microsoft account so that you may use the Skype Number that you have bought.

We hope you find this answer helpful. Should you need any further assistance or have additional questions, please do not hesitate to contact us again.

My turn:

Thanks for the clarity even though the situation is far from ideal.

Re: What we can recommend is that you use another device that supports logging in using a Microsoft account so that you may use the Skype Number that you have bought.

Is there a list of operating systems/ devices that support such functionality anywhere, i.e. a web page? Although given Microsoft now owns Skype I would hope/ expect the functionality would come to my device imminently? Would you be able to confirm if the functionality is planned for the next release of the Skype client for Windows phone 7.X (7.8 perhaps?) and/ or whether the functionality exists/ is planned for the Windows Phone 8 client? Also when these releases are scheduled for/ is there any ‘roadmap’  information available anywhere?

Thanks in advance.

Skype's turn (different person again):

Thank you for your reply.

We understand your concern that you want to know if you'll be able to log your Microsoft email in Windows phone 7.8 or 8. We'll be more than glad to further assist you.

Yes, you may use your Microsoft credentials on Windows phone 8. We're sorry that it's not available for version 7.8

Should any further issues arise, please feel free to contact us again.

So we get there in the end: another dead end for Windows Phone 7.X, even though Microsoft is now in charge! Rubbish!

 

Apps, Operating Systems and Devices

6. December 2012 16:00 by Chris in dev  //  Tags: , , ,   //   Comments (5)

The mix of devices accessing content over the Internet has changed significantly over recent years. The form factor with the biggest impact here is the smartphone, with Apple driving matters with the iPhone and Android taking over in market share terms. There are other players who will continue to challenge the market dominance of the ‘big 2’, with perhaps the best bet being Microsoft though, admittedly, they have failed to make any significant impact with Windows Phone 7.X. This may change if Microsoft manage some decent and cross pollinating marketing of Windows 8, Windows RT and Windows Phone 8. I’m not holding my breath though. Hands up who knows the difference between WinRT and Windows RT, for example?

Anyway, each phone operating system and surrounding ecosystem has its strengths and weaknesses and I won’t enter into related discussions here. What I shall consider is the messy situation we have with ‘app’ development. The term ‘app’ has entered common parlance though I’m unsure what the shared understanding of the term actually is. Certainly this has been driven into the collective consciousness by Apple’s ‘AppStore’, and subsequently by the misleadingly named ‘Google Play’. Therefore ‘app’ refers to applications that are designed to be run on mobile devices – initially smartphones and then more recently on tablet devices such as the iPad and the Nexus 7? Microsoft has jumped on the bandwagon with its similar Windows Phone Marketplace and, most recently, the Windows Store.

But ‘app’ just means application doesn’t it? So here each operating system has its own app store which delivers applications designed to be run on that operating system including, and this is key, a User eXperience consistent with the design values pertinent to the target device. Thus developers/ organisations, as part of their business model, may choose to target an individual platform for their application. If they have the right app each of the major platforms offers a significant market and this approach can work. The problem then comes if they wish to extend their market to other platforms – currently for the very best user experience they will need to develop that application in a quite different set of technologies, which means that porting apps from one platform to another is expensive (I note that there are cross platform tools out there but, as far as I am aware, they remain largely unproven – see below).

Now switch to an alternate scenario of an organisation which is not targeting a platform but targeting an existing customer base and hence will find themselves in the situation of prioritising development of their apps for differing platforms. Take the example of a bank which wishes to produce an app for customers to perform account management. Why? Well for competitive advantage of course – to keep existing customers with them and to encourage new customers to them. They will need to produce and maintain multiple versions of the application for the different platforms: 2, 3, 4? Logically they would then continue to prioritise platforms based on the breakdown of their current/ targeted user base.

So, firstly, is this situation any different from that with more traditional devices – Apple or Microsoft OS based desktop or laptop devices? Yes, because the mobile nature of devices opens up so many more useful app scenarios and the app store concept has taken off. No, in that we had, and have, OS specific traditional client computing ‘apps’ – and there the solution was moving the applications to the web and to related cross platform technologies.

So there are two, related solutions to this problem area which is only going to get worse as the market further fractures with device form factors and operating systems:

  1. rather than having client applications specifically developed for each mobile OS let’s write them in HTML5 and related technologies. There has already been a big push in the last 3+ years to move more and more functionality down to the client as devices became more powerful, and this path offered more scalability than each client device using significant server computing resources. A caveat here – mobile devices offer significantly less computing resources than your desktop client, though this is changing quickly. Technology has a habit of doing this …
  2. rather than having client applications specifically developed for each mobile OS let’s write them in a generic fashion and rely more on using tools and technology to ‘translate’ these apps to work in a variety of client devices.

Or, probably, a bit of both. The downside? Well, there is device specific knowledge and trickery to ensuring optimal user experiences (particularly) in apps. Will the user experience be good enough for end users via cross-platform development solutions? I hope so. The current situation can’t be sustainable, can it?

Chris Sully
Technical Director, Propona

[first published: https://connect.innovateuk.org/web/propona/blog/-/blogs/apps-operating-systems-and-devices?ns_33_redirect=%2Fweb%2Fpropona%2Fblog]

 

Allowing for User Mischief - A Bad Word Filter

12. April 2003 15:39 by Chris in   //  Tags:   //   Comments (1)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn, where the code is also available for download

Introduction

This article presents an improved core implementation of a solution to a particular problem I come across occasionally: detection and/ or removal of suspect words from user supplied text on web sites. A typical application scenario might be a discussion forum. For example, I’ve worked on a few sports related web sites where discussions can become …heated and the language used occasionally strays into that inappropriate for the general audience of the site. There are several approaches to dealing with this problem, some of which are discussed in my previous articles on this subject published on ASPAlliance ( http://www.aspalliance.com/sullyc/articles/user_mischief.aspx - no longer available) and 15Seconds ( http://www.15seconds.com/issue/030121.htm - no longer available). The latter article looks at a composite control based implementation by the way. As indicated in these articles I suggest the best way would be identification and removal of any suspect words.

A Starting Point

My previous implementation was based on a word and/ or word fragment ('word roots') list defined with an XML document. The items from this list were then compared against the user-inputted text string and matches highlighted using the string manipulation functionality available in .NET. For re-usability I decided on a user control. See the first article ( http://www.aspalliance.com/sullyc/articles/user_mischief.aspx ) for full details but the core of the implementation is reading the XML into a local data structure for subsequent direct comparison, in this case an ArrayList:

 <%@ Control Language="VB" %>
 <%@ Import Namespace="System.Xml" %>
 
 <script language="VB" runat="server">
 
 ' Word roots loaded from bad_words.xml
 Dim alWordList As new ArrayList
 
 Sub Page_Load()
   ' Read every text node of the XML document into the list
   dim xmlDocPath as string = server.mappath("bad_words.xml")
   dim xmlReader as XmlTextReader = new XmlTextReader(xmlDocPath)
   while (xmlReader.Read())
     if xmlReader.NodeType=XmlNodeType.Text then
       alWordList.Add(xmlReader.Value)
       trace.write("Added: " & xmlReader.Value)
     end if
   end while
   xmlReader.Close()
 End Sub
 
 ' Returns InputString with every occurrence of a word root replaced by ****
 Public Function CheckString(InputString as String) as string
   dim element as string
   trace.write("Checking " & InputString)
   For Each element in alWordList
     trace.write("Checking: " & element)
     InputString=InputString.Replace(element,"****")
   Next
   trace.write("Returning " & InputString)
   Return InputString
 End Function
 
 </script> 

with the XML file being of the format:

 <?xml version="1.0"?>
 <words>
   <word>word root 1</word>
   <word>word root 2</word>
 </words> 

With the actual words replaced to protect the innocent.
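
For readers who want to experiment outside ASP.NET, the loading step can be sketched in a few lines of Python. This is only an illustration of the same idea, not the article's code: the function name load_word_roots and the inline sample document are made up for the example, and it uses the standard library's xml.etree.ElementTree instead of an XmlTextReader.

```python
import xml.etree.ElementTree as ET
from typing import List

# Hypothetical stand-in for bad_words.xml, using the same placeholder entries.
SAMPLE_XML = """<?xml version="1.0"?>
<words>
  <word>word root 1</word>
  <word>word root 2</word>
</words>"""


def load_word_roots(xml_text: str) -> List[str]:
    """Collect the text of every <word> element, skipping empty ones."""
    root = ET.fromstring(xml_text)
    roots = []
    for word in root.iter("word"):
        text = (word.text or "").strip()
        if text:
            roots.append(text)
    return roots


print(load_word_roots(SAMPLE_XML))
```

Unlike the reader loop above, which collects every text node, this sketch looks only at word elements, so stray text elsewhere in the file would be ignored.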

Then all that remains is capturing of user text, via a textbox perhaps, registering the user control for use in the page:

 <anti_swear:cleanup id="cleanup1" runat="server" /> 

and using the control to check the inputted text for ‘naughty’ word roots:

 dim clean_text as string
 
   clean_text=tbMessage.text 'text to be checked
   trace.write("message text: " & clean_text)
 
   clean_text=cleanup1.CheckString(clean_text)
   trace.write("message text (cleaned): " & clean_text)
 
   if clean_text<>tbMessage.text then
     trace.write("Text not clean!")
     tbMessage.text=clean_text
     lblInfo.Text="Naughty words found ... please remove!"
   else
     'all is OK … submit to db/ other permanent store for later recall 

So, in this simple implementation CheckString returns a string with naughty word roots replaced by ‘****’ and we can detect if such words have been found as the returned text will be different from that passed into the function.

The actual detection is very simple:

 InputString=InputString.Replace(element,"****") 

too simple in fact as we’ll shortly explore.

Note that in the XML document I’ve used the phrase ‘word root 1’. This is important: only the roots of suspect vocabulary need to be placed in the XML document, which reduces the effort involved for you. This should limit the number of XML elements you need to cover the commonly used expletives, but it also means care must be taken not to exclude perfectly acceptable words.

The Problem

What’s the problem? Well, you may already have realised. It was pointed out to me by my fellow ASPAlliance columnist Jonathan Cogley (and alluded to in the last paragraph of the previous section):

Your approach uses a regular text replace which could create a new problem. Since it will identify the offending sequence of letters in perfectly harmless words e.g. scunthorpe would be rendered as s****horpe.

Jonathan suggested two possible solutions, with my thoughts on implementation also below:

  • a white word list – a list of permissible words to be checked if a word on the black word list was found. This approach I believe to be too prone to programmer 'error' – there are too many language combinations to provide a sleek solution.
  • regular expressions – the powerful language of regular expressions should be able to provide a better matching algorithm that would alleviate the problem.

The Solution

Let’s consider and see if we can find a better solution. An obvious starting point is the example exception above and thus the regular expression concept of word boundaries.

Scunthorpe (a town in the UK, for our international readers, and possibly towns elsewhere in the world for all I know).

As we’re interested in roots of words we’d prefer that Scunthorpe not match because it starts with an S and hence shouldn’t be offensive to anyone. However we are interested in matching any derivatives of our dubious root words so whilst we want to specify the beginning word boundary we shouldn’t be interested in the ending word boundary.

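In regular expressions word boundaries are identified via the concept of an anchor. Anchors specify the position at which the pattern must occur rather than matching any characters themselves. For example:

^ Matches at the start of the string (or of each line in multiline mode).
$ Matches at the end of the string (or of each line in multiline mode).
\< Matches at the beginning of a word (available in some regular expression dialects, such as grep, but not in .NET).
\> Matches at the end of a word (again, not available in .NET).
\b Matches at the beginning or the end of a word.
\B Matches at any position that is not the beginning or end of a word.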

Thus the above include a few options we’re interested in. Let’s use \b, the word boundary anchor. This matches the position between a word character (a letter, digit or underscore) and anything that can come before or after a word, e.g. white space, punctuation and/or the beginning or end of a line.

So we want to engage in a regular expression search / replace for '\broot word'. This should solve our problem. How do we do this in .NET?

Regular Expression Solution in .NET

We’re going to focus on solving this little problem and shall not be considering the range of extensive support for Regular Expressions in .NET. However, look out for such an article on dotnetjohn in the near future.

There are a variety of supporting classes we could use:

Regex: the Regex class represents a regular expression. It also contains static methods that allow use of other regular expression classes without explicitly instantiating objects of the other classes.

Match: the Match class represents the results of a regular expression matching operation.

MatchCollection: the MatchCollection class represents a sequence of successful non-overlapping matches.

An example of how we might utilize the Regex class is:

 Dim r As Regex = New Regex("\b" & "NaughtyRoot")
 

Further, among the members of the Regex class are:

IsMatch - indicates whether the regular expression finds a match in the input string.

Match - searches an input string for an occurrence of a regular expression and returns the precise result as a single Match object.

Matches - searches an input string for all occurrences of a regular expression and returns all the successful matches as if Match were called numerous times.

Replace - replaces all occurrences of a character pattern defined by a regular expression with a specified replacement character string.

In line with our previous implementation we would use the Replace function, replacing our CheckString function with:

 ' Requires an Import directive for the System.Text.RegularExpressions
 ' namespace at the top of the user control.
 Public Function CheckString(InputString as String) as string
   Dim r As Regex
   dim element as string
   dim output as string
   trace.write("Checking " & InputString)
   For Each element in alWordList
     ' \b anchors the match to the start of a word
     r = New Regex("\b" & element)
     trace.write("Checking: " & element)
     InputString=r.Replace(InputString,"****")
   Next
   trace.write("Returning " & InputString)
   Return InputString
 End Function 

Which does indeed do what we wish. One caveat is that as we are only checking for the beginning of words some swear words may slip through the net if we don’t explicitly add them to the bad words list. ‘Motherf**ker’ is an example. I can’t think of an easy way around this problem however. You could extend the solution to include end of word boundaries but then you need to include ‘f**ker’ as well as ‘f**k’, for example. Plus, you increase the risk of trapping valid words.

Note also that the provided solution is not perfect on the grounds that some valid words will no doubt still be challenged by this solution. I do believe it is a good compromise, however. It might be a good option to change the language of the interface to indicate the presence of 'possibly suspect words' and to let the user edit the text. It should be obvious to the user why their text has been returned to them.

Conclusion

I hope this article has provided a useful extension to my earlier articles on the subject and in doing so introduced some readers to the powerful language provided by regular expressions. If you’d like to raise any points about this article, in particular thoughts on how the solution could be improved, email me (sullyc-olops@btinternet.com ).

The Zipfile

The zipfile includes the following:

markII.aspx web form page with text box and calling user control methods.

user_controls
/anti_swear.ascx string based version
/anti_swear2.ascx regular expression based version
/bad_words.xml

To use, populate bad_words.xml and alter the user control reference in markII.aspx to see the differences between the versions.

You may download the code here.

XSD Schemas: An Introduction

31. March 2003 15:35 by Chris in dev  //  Tags:   //   Comments (0)

Note that this article was first published on 02/01/2003. The original article is available on DotNetJohn.

Introduction

The XML Schema definition language (XSD) enables you to define the structure (elements and attributes) and data types for XML documents. It enables this in a way that conforms to the relevant W3C recommendations for XML schema. XSD is just one of several XML schema definition languages but is the one best supported by Microsoft in .NET.

The schema specifies the ordering of tags in the document, indicates fields that are mandatory or that may occur different numbers of times, gives the datatypes of fields and so on. The schema importantly is able to ensure that data values in the XML file are valid as far as the parent application is concerned.

Schemas are also useful when developers in different companies or even in different parts of the same company read and write XML documents that they will share. The schema acts as a contract specifying exactly what one application or part of an application must write into an XML file and another program can expect to be there. The schema unambiguously states the correct format for the shared XML.

A well formed XML document is one that satisfies the usual rules of XML. For example, in a well formed document there is exactly one root element, all opening tags have corresponding closing tags, tag names do not contain spaces, the names in opening and closing tags are spelt in exactly the same way, tags are properly nested, etc.

A valid document is one that is well formed and that satisfies a schema.

Visual Basic .NET provides several methods for validating an XML document against a schema. There are articles on how to do this already on dotnetjohn ( Upload an XML File and Validate Against a Schema ). The focus of this article, however, shall be on the basic elements of the Microsoft preferred schema language (XSD), after a brief history lesson: an introduction to other common types of schema you may come across and why the XSD alternative was developed.

DTD and XDR

While XML is a relatively new technology the need for schemas was recognised early and so several have already been created. Microsoft focuses heavily on the most recent version, XSD, so VB has the most support for this form of schema, and hence will probably be the one Microsoft developers use most.

However, you may well happen upon the situation, particularly with enterprise development, where you are required to work with other forms of XML schema. While VB has few tools for building other types of schema, it can validate data using DTD and XDR.

The first schema standard, developed alongside XML v1.0, was the DTD (Document Type Definition). Many believed this was not an ideal schema definition language, which is why XSD was developed through the W3C as a replacement. One of the problems was, and is, that DTDs are not XML based, so you have yet another language to learn to go with the proliferation that comes with XML (XPath and XSL for example). Further, developers found that DTDs lacked the power and flexibility needed to completely define all of the datatypes they wanted to represent in XML. A schema that can’t validate all of the data’s requirements is of limited use.

XDR (XML Data Reduced) is another schema language, this time XML based and providing a superset of the functionality of DTDs. XDR should not be confused with Sun’s XDR (External Data Representation) … another format for data description but in this case physical representation of data rather than logical representation as per XML and XML Data Reduced schemas.

The last few paragraphs were just to let you know there are other schema formats out there, some of which have limited support in .NET. Now we’ll focus on XSD.

XSD

Wherever you see 'schema' from now on we’re referring to XSD. As with many topics relating to XML (see my article on XSL, Understanding How to Use XSL Transforms) the XSD specification is complex and evolving quickly. The following covers the basics of XSD so you can start to construct some useful schemas for use in your own applications. You’ll then need to follow up the information presented here elsewhere.

Note that Visual Studio .NET includes an XSD editor that makes generating schemas relatively painless. Unless you understand some of the basic rules of XSD, however, the editor may prove a tad confusing.

Types and Elements

XSD schemas contain type definitions and elements. A type definition defines an allowed XML data type. An 'address' might be an example of a type you might want to define. An element represents an item created in the XML file. If the XML file contains an Address tag, then the XSD file will contain a corresponding element named Address. The data type of the Address element indicates the type of data allowed in the XML file’s Address tag.

Type definitions may be simple or complex. Simple and complex types allow definition of new data types in addition to the 19 built in primitive data types, which include string, boolean, decimal, date, etc.

A simpleType allows a type definition for a value that can be used as the content of an element or attribute. This data type cannot contain elements or have attributes.

A complexType allows a type definition for elements that can contain attributes and elements.

Let’s pause here and take a look at an example. Let’s work backwards from an XML document as I’ll assume we’re all reasonably familiar with XML but less so with XSD. Here’s an XML file representing a simplified contacts database containing just one record currently:

 <?xml version="1.0" encoding="utf-8" ?>
 <Contacts>
   <Contact>
     <FirstName>Chris</FirstName>
     <Surname>Sully</Surname>
     <Address>
       <Street>22 Denton Road</Street>
       <City>Cardiff</City>
       <Country>Wales</Country>
     </Address>
     <Tel>02920371877</Tel>
   </Contact>
 </Contacts> 

In fact in Visual Studio .NET you can simply right click on this XML file and generate the schema from it. Of course it may well not be quite correct for your needs, as it shall be based on one record of data. Not even Visual Studio .NET can predict the future with any accuracy … ;) Here’s what it comes up with:

 <?xml version="1.0" ?>
 <xs:schema id="Contacts" targetNamespace="http://tempuri.org/XMLFile1.xsd" xmlns:mstns="http://tempuri.org/XMLFile1.xsd" xmlns="http://tempuri.org/XMLFile1.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" attributeFormDefault="qualified" elementFormDefault="qualified">
   <xs:element name="Contacts" msdata:IsDataSet="true" msdata:Locale="en-GB" msdata:EnforceConstraints="False">
     <xs:complexType>
       <xs:choice maxOccurs="unbounded">
         <xs:element name="Contact">
           <xs:complexType>
             <xs:sequence>
               <xs:element name="FirstName" type="xs:string" minOccurs="0" />
               <xs:element name="Surname" type="xs:string" minOccurs="0" />
               <xs:element name="Tel" type="xs:string" minOccurs="0" />
               <xs:element name="Address" minOccurs="0" maxOccurs="unbounded">
                 <xs:complexType>
                   <xs:sequence>
                     <xs:element name="Street" type="xs:string" minOccurs="0" />
                     <xs:element name="City" type="xs:string" minOccurs="0" />
                     <xs:element name="Country" type="xs:string" minOccurs="0" />
                   </xs:sequence>
                 </xs:complexType>
               </xs:element>
             </xs:sequence>
           </xs:complexType>
         </xs:element>
       </xs:choice>
     </xs:complexType>
   </xs:element>
 </xs:schema> 

Picking out some key elements:

An XML Schema is composed of the top-level schema element. The schema element definition must include the following namespace:

http://www.w3.org/2001/XMLSchema

and you can see from the above this isn’t all that is generated, but we’ll ignore the extra elements for now.

The actual definition commences with the first <xs:element… definition which has the name attribute 'Contacts'. Again the other attributes we can ignore for now. Contacts is necessarily defined as a complex type as it contains other elements.

We then encounter <xs:choice…: a choice element allows the XML file to contain one of the elements inside the choice element. The attribute maxOccurs="unbounded" is used to indicate that the Contacts element can contain any number of Contact elements.

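The Contact element is again a complex type, comprised of a sequence of further elements. A sequence element requires the XML document to contain the items inside the sequence in the order given. By default each element in a sequence must appear exactly once; this can be overridden using the minOccurs and maxOccurs attributes to indicate that it can occur any number of times (including 0). Note that the generated sequence places Tel before Address, whereas the sample XML above has Address before Tel, so strict validation against this schema would require the two to be swapped in one or the other.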

The individual elements are defined to be of type string (a simple type). Address is similarly defined as a complex type of a sequence of elements of type string.
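Because a sequence enforces order, a record intended to validate against the generated schema has to follow the order of the sequence. The following is a sketch of such an instance document. It assumes the document declares the schema's target namespace (http://tempuri.org/XMLFile1.xsd) as its default namespace, which elementFormDefault="qualified" calls for.

```xml
<?xml version="1.0" encoding="utf-8" ?>
<!-- The default namespace matches the targetNamespace of the generated schema -->
<Contacts xmlns="http://tempuri.org/XMLFile1.xsd">
  <Contact>
    <FirstName>Chris</FirstName>
    <Surname>Sully</Surname>
    <!-- Tel comes before Address, following the order of the xs:sequence -->
    <Tel>02920371877</Tel>
    <Address>
      <Street>22 Denton Road</Street>
      <City>Cardiff</City>
      <Country>Wales</Country>
    </Address>
  </Contact>
</Contacts>
```

As every child element has minOccurs="0", any of them could be left out, but those that do appear must keep this relative order.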

Hopefully that has been an informative introduction to some commonly encountered constructs by way of an example. We’ll now continue on to look at some of the XSD language constructs in a little more detail, starting with elements.

Elements and their attributes

e.g. <xs:element name="Street" type="xs:string" minOccurs="0" />

An element defines an entity in an XML file. So the above defines an element of name <Street> and type string. The element can have several attributes which modify the element's behaviour, for example:

minOccurs and maxOccurs: as indicated already these give the minimum and maximum allowed number of times an element can occur within a complex type. To make an element optional minOccurs is set to 0. To allow an unlimited number of the element maxOccurs is set to 'unbounded'.

ref: makes the element a reference to another (globally declared) element in the schema. This is best avoided, however: it is better to define a distinct type and base both element definitions on this type rather than introduce such dependencies into the schema, e.g.

 <xsd:simpleType name="PhoneNumberType">
   <xsd:restriction base="xsd:string"/>
 </xsd:simpleType>
 ...
 <xsd:complexType name="Contact">
   <xsd:sequence>
   ...
     <xsd:element name="HomeTel" type="PhoneNumberType"/>
     <xsd:element name="WorkTel" type="PhoneNumberType"/>
   ...
   </xsd:sequence>
 </xsd:complexType> 

Though you might at the same time like to tie down your definition of the PhoneNumberType more tightly. We’ll return to <xsd:restriction … shortly.

default: assigns a default value to the element. Strictly, under the W3C rules an element's default is applied when the element appears in the XML document but is empty; if the element is omitted altogether, no value is supplied for it. An element that has a default value will usually also have minOccurs set to 0 so the XML document may omit it.

fixed: gives the element an unchangeable value. The corresponding XML element cannot have another value, although it may be omitted if minOccurs is 0. Why is this useful? Well, you may want to ensure that an XML data field has the same value throughout the document; for example, you may want to add a new Country field to an existing XML document and ensure that its value is UK for every record.
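As a rough illustration of the default and fixed attributes, here is a minimal schema sketch. The Title element and its default value of Dr are hypothetical, invented purely for the example; Country with a fixed value of UK follows the scenario described above.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="Contact">
    <xsd:complexType>
      <xsd:sequence>
        <xsd:element name="Surname" type="xsd:string" />
        <!-- default: the value assumed when Title carries no value of its own -->
        <xsd:element name="Title" type="xsd:string" default="Dr" minOccurs="0" />
        <!-- fixed: if Country is present, its value must be exactly UK -->
        <xsd:element name="Country" type="xsd:string" fixed="UK" minOccurs="0" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Note that a single element declaration cannot carry both default and fixed; they are mutually exclusive.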

Types

Type definitions have two goals:

  1. To describe the data allowed in a simple field, e.g. text format of an e-mail address. Simple types achieve this goal.
  2. To describe relationships amongst different fields, e.g. a contact type consists of a sequence of firstname, surname, telephone, etc. Complex types achieve this goal of designing more complex data types.

In addition to simple and complex types there are built in types, similar to simple data types such as integers, dates, etc. in other programming languages or .NET’s value data types provided by the Common Type System (CTS). We’ve seen one of the built in types already in our element definitions in the form of the often-employed string type. These built in types are W3C defined and include date, dateTime, decimal, double, float, gYear, etc. See the SDK documentation for an authoritative list.

A facet is a characteristic of a data type that you can use to restrict the values allowed by a type. Facets are effectively attributes of the data type. For example, the string datatype has a maxLength facet. Again for further details of the facets of each built in type see the SDK documentation.

Facets enable short cuts to building simple types by restricting another data type. We’ve already seen the restriction construct in example code above; using this and the enumeration facet of the string built in data type we can define allowable values for a type, e.g.

 <xsd:simpleType name="Colours">
   <xsd:restriction base="xsd:string">
     <xsd:enumeration value="red"/>
     <xsd:enumeration value="green"/>
     <xsd:enumeration value="blue"/>
   </xsd:restriction>
 </xsd:simpleType> 

The pattern facet is particularly powerful as it specifies a regular expression that the XML field data must match. Regular Expressions are worthy of an article or three in themselves, and there are several books on the subject if interested in improving your knowledge. Look out for an article on regular expressions on dotnetjohn in the not too distant future! For now, we’ll largely skip over the topic of regular expressions though here is an example:

 <xsd:simpleType name="emailType">
   <xsd:restriction base="xsd:string">
     <xsd:pattern value="[^@]+@[^@]+\.[^@]+" />
   </xsd:restriction>
 </xsd:simpleType> 

Let’s decipher "[^@]+@[^@]+\.[^@]+" – this matches an e-mail address of the form a@b.c where a, b and c are any strings that do not contain the @ symbol. The value string equates to 'match any character other than the @ symbol one or more times; then match an @ symbol; then again match any character other than the @ symbol one or more times; next match a full stop and then once more any character other than the @ symbol one or more times'.

The length, minLength, maxLength, totalDigits, fractionDigits, minExclusive, maxExclusive, minInclusive and maxInclusive facets are all self-describing but it’s important to know they are available.
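To give a feel for how these facets are written, here is a minimal sketch combining several of them. The type names (SurnameType, AgeType, PriceType) and the particular limits are hypothetical, chosen only for illustration.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- A string of between 1 and 50 characters -->
  <xsd:simpleType name="SurnameType">
    <xsd:restriction base="xsd:string">
      <xsd:minLength value="1" />
      <xsd:maxLength value="50" />
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A whole number from 0 to 120 inclusive -->
  <xsd:simpleType name="AgeType">
    <xsd:restriction base="xsd:integer">
      <xsd:minInclusive value="0" />
      <xsd:maxInclusive value="120" />
    </xsd:restriction>
  </xsd:simpleType>

  <!-- A decimal greater than 0 with at most 7 digits in total,
       at most 2 of them after the decimal point -->
  <xsd:simpleType name="PriceType">
    <xsd:restriction base="xsd:decimal">
      <xsd:minExclusive value="0" />
      <xsd:totalDigits value="7" />
      <xsd:fractionDigits value="2" />
    </xsd:restriction>
  </xsd:simpleType>
</xsd:schema>
```

Each facet is a child element of the restriction, with its limit given in the value attribute, in just the same way as the enumeration and pattern facets.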

In addition to the primitive built in types there exist built in data types derived from these primitive types. These derived built in data types refine the definition of primitive types to create more restrictive types. They are based on the string and decimal primitive types.

The string derived types represent various entities that occur in XML syntax itself. For example, the Name type represents a string that satisfies the form of XML token names – it begins with a letter, underscore or colon and the rest of the string contains letters and digits.

The decimal derived types represent various kinds of numbers and thus are considerably more useful for validating data. There are thirteen such decimal derived types, e.g. byte, int, negativeInteger. See the SDK documentation for the full list.
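The decimal derived types are used in exactly the same way as string, via the type attribute. A small sketch follows; the OrderLine element and its children are hypothetical names invented for the example.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="OrderLine">
    <xsd:complexType>
      <xsd:sequence>
        <!-- positiveInteger: whole numbers from 1 upwards -->
        <xsd:element name="Quantity" type="xsd:positiveInteger" />
        <!-- unsignedByte: whole numbers from 0 to 255 -->
        <xsd:element name="DiscountPercent" type="xsd:unsignedByte" />
        <!-- int: whole numbers in the 32-bit signed range -->
        <xsd:element name="StockAdjustment" type="xsd:int" />
      </xsd:sequence>
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Choosing the narrowest derived type that fits often saves writing a restriction by hand.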

Attributes

Just as you use an XSD schema’s element entities to define the data that can be contained in the corresponding XML data elements, you can use attribute entities to define the attributes the XML element can have. Let’s return to Visual Studio .NET and see what schema it comes up with for the following small attribute-centric piece of XML:

 <contacts>
   <contact firstname="Chris" Surname="Sully"/>
 </contacts> 

 

 <?xml version="1.0" ?>
 <xs:schema id="contacts" targetNamespace="http://tempuri.org/attribute_centric.xsd" xmlns:mstns="http://tempuri.org/attribute_centric.xsd" xmlns="http://tempuri.org/attribute_centric.xsd" xmlns:xs="http://www.w3.org/2001/XMLSchema" xmlns:msdata="urn:schemas-microsoft-com:xml-msdata" attributeFormDefault="qualified" elementFormDefault="qualified">
   <xs:element name="contacts" msdata:IsDataSet="true" msdata:Locale="en-GB" msdata:EnforceConstraints="False">
     <xs:complexType>
       <xs:choice maxOccurs="unbounded">
         <xs:element name="contact">
           <xs:complexType>
             <xs:attribute name="firstname" form="unqualified" type="xs:string" />
             <xs:attribute name="Surname" form="unqualified" type="xs:string" />
           </xs:complexType>
         </xs:element>
       </xs:choice>
     </xs:complexType>
   </xs:element>
 </xs:schema> 

You can see that attributes equate to the

<xs:attribute name="firstname" form="unqualified" type="xs:string" />

construct. The form attribute of the attribute tag (yes that does make sense!) is set to unqualified. This means that the attributes in the XML file do not need to be qualified by the schema’s namespace.

Why use attributes rather than elements (referred to as attribute-centric and element-centric XML respectively)? Well, they are often interchangeable and it is largely a matter of taste. Generally, however, elements should contain data and attributes should contain information that describes the data. So for contacts, where names and addresses are the data itself, one could recommend an element-centric approach.

However, such decisions are tempered by the following:

  • the attribute centric approach consumes less file space
  • attributes can specify default values whereas elements generally do not
  • you can order elements via the sequence construct; there is no method of enforcing order with attributes
  • elements can occur more than once in complex types but an attribute can occur only once
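To illustrate the point about attribute defaults, here is a hand-written sketch of an attribute-centric contact element. The country attribute and its default of UK are assumptions made for the example and are not part of the generated schema above.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:element name="contact">
    <xsd:complexType>
      <!-- use="required": the attribute must be present on every contact -->
      <xsd:attribute name="firstname" type="xsd:string" use="required" />
      <xsd:attribute name="Surname" type="xsd:string" use="required" />
      <!-- default: the value assumed when the attribute is left off -->
      <xsd:attribute name="country" type="xsd:string" default="UK" />
    </xsd:complexType>
  </xsd:element>
</xsd:schema>
```

Attributes are optional unless use="required" is specified, and the order in which they are written in the XML file is never significant.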

Complex Types

As previously stated, whereas a simple type determines the type of data a simple text field can hold, a complex type defines relationships among other types. For example we defined a contact record to include fields to store first name and surname. Simple types are then used to define the allowable values in the fields. The complex type determines the fields that make up the contact type.

Complex types are also useful for defining XML elements that can have attributes – simple types cannot have attributes. A complex type can contain only one of a small number of elements. The elements within that element define the relationship the complex type represents. The most common elements are simpleContent, sequence, choice and all, as follows:

simpleContent: a complex type that contains a simpleContent element must contain only character data or a simple type. This construct is primarily so one may add attributes to a simple type.

sequence: as we’ve seen this allows specification of a required order to elements of a complexType.

choice: again as we’ve seen the corresponding XML data must include exactly one of the elements listed inside the choice. Note this is entirely different from the enumeration facet previously introduced: rather than a fixed set of values the choice construct allows the complex type to contain one of several types.

all: when a complex type includes the all element the corresponding XML data must include the listed elements but may do so in any order. Individual elements can be made optional by setting minOccurs to 0.
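The constructs above that we have not yet seen in a schema can be sketched as follows. The type names (TelType, ContactMethodType, AddressType) and the type attribute on Tel are hypothetical, introduced only for illustration.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <!-- simpleContent: character data plus an attribute,
       e.g. <Tel type="home">02920371877</Tel> -->
  <xsd:complexType name="TelType">
    <xsd:simpleContent>
      <xsd:extension base="xsd:string">
        <xsd:attribute name="type" type="xsd:string" />
      </xsd:extension>
    </xsd:simpleContent>
  </xsd:complexType>

  <!-- choice: exactly one of Tel or Email -->
  <xsd:complexType name="ContactMethodType">
    <xsd:choice>
      <xsd:element name="Tel" type="TelType" />
      <xsd:element name="Email" type="xsd:string" />
    </xsd:choice>
  </xsd:complexType>

  <!-- all: Street and City must each appear once and Country is optional;
       the elements may come in any order -->
  <xsd:complexType name="AddressType">
    <xsd:all>
      <xsd:element name="Street" type="xsd:string" />
      <xsd:element name="City" type="xsd:string" />
      <xsd:element name="Country" type="xsd:string" minOccurs="0" />
    </xsd:all>
  </xsd:complexType>
</xsd:schema>
```

One limitation worth knowing: elements inside all can appear at most once each, so a repeating element needs sequence or choice instead.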

Named and Unnamed Types

Finally, to finish off our overview: if you will use a type only once there is no need to give it a name as you will not need to reference it again. You may choose to include the type definition in the code which uses it. This is the case for the Visual Studio generated schemas above, e.g.:

 <xs:complexType>
   <xs:choice maxOccurs="unbounded">
     <xs:element name="contact">
       <xs:complexType>
         <xs:attribute name="firstname" form="unqualified" type="xs:string" />
         <xs:attribute name="Surname" form="unqualified" type="xs:string" />
       </xs:complexType>
     </xs:element>
   </xs:choice>
 </xs:complexType> 

If the schema referenced this complexType again it would be more succinct to add a name attribute to the <xs:complexType … definition so you could reference it again later in the schema definition.

To clarify by example, the following uses the email simple type we introduced earlier to reduce the size of an ‘emailContactType’ definition:

 <xsd:simpleType name="emailType">
   <xsd:restriction base="xsd:string">
     <xsd:pattern value="[^@]+@[^@]+\.[^@]+" />
   </xsd:restriction>
 </xsd:simpleType>
 
 <xsd:complexType name="emailContactType">
   <xsd:sequence>
     <xsd:element name="name" type="xsd:string"/>
     <xsd:element name="email" type="emailType"/>
   </xsd:sequence>
 </xsd:complexType> 

Alternatively you could have defined the email type within the email element. Whether reusing or not, employing this convention will generally make your code tidier.
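For comparison, the unnamed alternative might look something like the sketch below: the same pattern restriction, but nested inside the email element rather than referenced by name.

```xml
<?xml version="1.0" ?>
<xsd:schema xmlns:xsd="http://www.w3.org/2001/XMLSchema">
  <xsd:complexType name="emailContactType">
    <xsd:sequence>
      <xsd:element name="name" type="xsd:string" />
      <!-- No type attribute: the element carries its own unnamed simpleType -->
      <xsd:element name="email">
        <xsd:simpleType>
          <xsd:restriction base="xsd:string">
            <xsd:pattern value="[^@]+@[^@]+\.[^@]+" />
          </xsd:restriction>
        </xsd:simpleType>
      </xsd:element>
    </xsd:sequence>
  </xsd:complexType>
</xsd:schema>
```

The cost of this form is that no other element can reuse the restriction, as there is no name by which to refer to it.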

Conclusion

There we shall halt our introduction to XML Schemas, and to the basic XSD constructs specifically. I hope you are now better placed to understand why and how to use XML schemas in .NET.

References

.NET Framework SDK documentation

Visual Basic .Net and XML
Stephens and Hochgurtel

Programming Visual Basic .NET
Francesco Balena
Microsoft Press

About the author

I am Dr Christopher Sully (MCPD, MCSD) and I am a Cardiff, UK based IT Consultant/ Developer and have been involved in the industry since 1996 though I started programming considerably earlier than that. During the intervening period I've worked mainly on web application projects utilising Microsoft products and technologies: principally ASP.NET and SQL Server and working on all phases of the project lifecycle. If you might like to utilise some of the aforementioned experience I would strongly recommend that you contact me. I am also trying to improve my Welsh so am likely to blog about this as well as IT matters.
