Monday 28 November 2011

Automattic crack down on Plugin Developers for Wordpress

Wordpress crack down on plugin developers wanting to Earn Money from their hard work

I have had to remove the PluginSponsors.com Wordpress plugin from all my Wordpress plugins due to Wordpress threatening the maker of the PluginSponsors.com plugin that they will remove any plugins that utilise his harmless advert with removal from the wordpress repository.

It's not as if Wordpress Plugin developers make much money from their hard work in the first place and people who use them seem to expect 24 hour support for free as well.

Rarely I might get a small donation from one of my plugin users but it is hardly enough to cover the expense of running a cloud based server so in my opinion Wordpress is bang out of order.

Not only have I had numerous blogs on wordpress.com removed without explanation (one I have had returned without any reasoning behind the ban in the first place) but when my first wordpress.com blog was banned the other year I found out just how unreasonable the people at Automattic can be from my email discussions with them.

Not only would refuse to offer proper explanations for their reasoning behind the ban even when I gave them logical counter arguments to anything they mentioned but their intransigence reminded me of a stubborn child demanding more sweeties and refusing to back down until they got them.

Then when logic and reason was too much for them they just ignored my emails leaving me with a banned site that was up until then receiving high amounts of traffic.

Today if I was recommending a free hosted blogging tool to anyone I would tell them to go and check out blogspot with Google.

I know this blog is a blogspot blog and doesn't exactly look much but it was my first foray into blogging and I created it a good few years ago.

Nowadays their blogging software is much better and unlike the free wordpress.com blogs they actually allow you to upload images into the header, show adverts and customise the CSS of your blog for free.

After trying to set up 3 wordpress.com blogs that all had the word "horseracing" in the domain which were all instantly banned (without any kind of reason) I set this one up on blogspot: UK Horse Racing Star without any problems at all.

As you can see it's a lot better looking than this old blog plus it took me less than 20 minutes to style it, upload a header image and set the adverts up.

From now on I think I will be sticking to blogspot as Automattic have shown that they don't care about the plugin developers who spend hours even weeks writing all the thousands of plugins that make their system work.

If they did care they would offer a marketplace like Joomla for developers to sell their hard work instead of cracking down on those plugin authors who want some kind of recompense for their development.

You can read the author of the PluginSponsors.com thoughts on the matter here Automattic Bullies.

Speeding up batch SQL Processes that use temporary table variables

Problems with TABLE variables and how to improve performance for SQL Batch Jobs


We recently moved our production system to SQL 2008. Everything seemed okay until we realised after a fortnight that a particular MS Agent job had not been completing correctly.

One of the steps within the job had been failing due to a missing SQL Login. This job transferred large amounts of data from daily tables to historical ones and a large backlog had now been created numbering in the tens of millions.

Due to the use of our batch delete process that removed data that had been transferred in small chunks (500 rows) to reduce locking the data had got to a stage where new records were being added almost as fast as we were removing them.

When I looked at the query that was doing the insert and then delete I saw that it was using a temporary table variable to hold the temporary data whilst it was being transformed.As this was now 20 million plus records it was a problem on it's own due to it's limitations.

This is a common SQL Performance bottleneck and one I have come across a few times now.

SQL's TABLE variables are very good for small datasets (a few hundred at most) and are very useful for array like usage within stored procedures. However when you move to large amounts of records they just become a performance nightmare.

Unlike proper temporary tables or fixed permanent tables you cannot add indexes to TABLE variables and with large record sizes this is usually a must. The constant table scans the process must have been having to do to find rows must have been a major cause of the slow performance.

I rewrote the stored proc to make use of a temporary table, added a clustered index on the columns I was using for my joins to the real table and I used a batch DELETE process using the TOP (@X) statement to remove old records.

This has sped the process up immensely but the automatic creation of indexes and creation and dropping of tables requires permissions higher than those that the website login I use for my site has.

Therefore to ensure everything ran smoothly I had to use the WITH EXECUTE AS 'login' statement to allow the user to impersonate a login with higher privileges.

A cut down example of the proc is below and it shows the following:

  1. Impersonating logins with higher permissions to allow for DDL statements e.g CREATE and DROP.
  2. Checking TEMP DB for existing temporary tables and indexes and dropping them if they exist.
  3. Deleting large numbers of records in batches to prevent blocking.
  4. Ensuring a SARGABLE where clause is used instead of DATEDIFF to get yesterdays data



CREATE PROCEDURE [dbo].[usp_sql_job_data_transfer] 
 @Success BIT = 0 OUTPUT,
 @RowsCopied INT = 0 OUTPUT,
 @RowsDeleted INT = 0 OUTPUT

WITH EXECUTE AS 'admin_user'
AS
  BEGIN

  SET NOCOUNT ON
  SET DATEFORMAT YMD

  DECLARE @Stamp datetime, 
   @InsertDate datetime,
   @TmpRows int,
   @Rows int,  
   @RowsInserted int,
   @error int
   

  -- check for existance of old temp table, a local temp table wont persist but we may need to use a global one (Example usage)
  IF object_id('tempdb..#JobHits') IS NOT NULL
    BEGIN
    -- drop it
      
    DROP TABLE #JobHits
    END 
 
  -- create temp table   
  CREATE TABLE #JobHits(
   JobFK int, 
   Stamp Datetime, 
   ViewHits int, 
   SearchHits int
  )

  -- job runs after midnight and we want a SARGABLE WHERE clause
  SELECT @Stamp = CONVERT(datetime,CONVERT(varchar,getdate(),23))

   
  SELECT @TmpRows = 0, @RowsInserted = 0, @RowsDeleted = 0  

  -- insert into my temp table the search hits by date (SQL 2005)
  -- if this proves to be still slow or causing blocks then try inserting in batches of 1000 OR 20 min chunks
  INSERT INTO #JobHits
  (JobFK, Stamp, ViewHits, SearchHits)
  SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),0, count(jobFk) 
  FROM JOBS_DATA with (nolock)
  WHERE HitType = 'S' 
   AND Stamp < @Stamp
  GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))

  SELECT @TmpRows = @@ROWCOUNT, @Error = @@ERROR
  
  IF @Error <> 0
    GOTO HANDLE_ERROR

  -- insert into my temp table view hits
  INSERT INTO #JobHits
  (JobFK, Stamp, ViewHits, SearchHits)
  SELECT JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23)),count(jobFk), 0
  FROM JOBS_DATA with (nolock)
  WHERE HitType = 'V' AND 
   Stamp < @Stamp -- SARGABLE WHERE CLAUSE - note I am not using DATEDIFF
  GROUP BY JobFK, CONVERT(datetime,CONVERT(varchar,Stamp,23))

  SELECT @TmpRows = @TmpRows + @@ROWCOUNT, @Error = @@ERROR


  -- if an error occurred jump to the error handler
  IF @Error <> 0 
    GOTO HANDLE_ERROR
  --If no error but no rows then its just a bad day for the site with no hits
  ELSE IF @TmpRows = 0  
    GOTO EXIT_PROC
  ELSE
    BEGIN
     
    -- add an index to aid lookups and searching 

    -- ensure no record exists already in Temp DB
    IF object_id('tempdb..clidx_#JobHits') IS NOT NULL 
      BEGIN
      
     DROP INDEX clidx_#JobHits ON #JobHits

      END

       
    -- Add a clustered index to help searching - cover the main join column JobFk and group by column Stamp 
    CREATE CLUSTERED INDEX [clidx_#JobHits] ON #JobHits
    (
     [JobFK] ASC,
     [Stamp] ASC    
    )WITH (PAD_INDEX  = OFF, STATISTICS_NORECOMPUTE  = OFF, SORT_IN_TEMPDB = OFF, IGNORE_DUP_KEY = OFF, ONLINE = OFF, ALLOW_ROW_LOCKS  = ON, ALLOW_PAGE_LOCKS  = ON, FILLFACTOR = 100) ON [PRIMARY]
    
    
    END
  
  
  -- join my temp table to the main table to get info about the company the job belonged to
  INSERT INTO JOBS_DATA_HISTORY
  (JobFk,Stamp,ViewHits, SearchHits, ClientFK)
  SELECT a.JobFK, a.Stamp, Sum(a.ViewHits), Sum(a.SearchHits), j.ClientFK
  FROM  #JobHits as a
  JOIN  JOBS as j with (nolock)
   ON  a.JobFk = j.JobPK
  GROUP BY a.JobFK, a.Stamp, j.ClientFk

  SELECT @RowsInserted = @@rowcount, @error = @@ERROR
  

  --if we are here then there must be > 0 rows otherwise we would have exited earlier
  IF @RowsInserted=0 or @Error<>0
    GOTO HANDLE_ERROR
 

  -- Now we have copied our data we need to delete it all from the daily table
  -- as this table is being used by the site still we delete in batches to prevent blocking locks
  SELECT @Rows = 1
  
  -- loop until no more rows are left
  WHILE @Rows > 0
    BEGIN 
    -- delete data in table using the TOP command to ensure we only delete on our indexed column in batches
    DELETE TOP(1000)
    FROM JOBS_DATA
    WHERE Stamp < @Stamp -- indexed column in JOBS_DATA
      
    SELECT @Rows = @@ROWCOUNT, @RowsDeleted = @RowsDeleted + @Rows, @Error = @@ERROR

    -- should i wait for a second in between batches? cannot find a definitive answer
    -- if problems occur (which they haven't so far) then try this
    --WAITFOR DELAY '00:00:01' 
    
    IF @Error <> 0 
      GOTO HANDLE_ERROR
    END

  --if no rows were deleted then something went pete tong
  IF @RowsDeleted=0
    GOTO HANDLE_ERROR
  END

-- clean up
CLEAN_UP:
    
   -- add an index to aid lookups and searching (not sure if when table is dropped the index record remains or not so being safe - check this!)
  IF object_id('tempdb..clidx_#JobHits') IS NOT NULL 
    BEGIN
  -- drop our clustered index
  DROP INDEX clidx_#JobHits ON #JobHits

    END
    
  IF object_id('tempdb..#JobHits') IS NOT NULL
    BEGIN
  -- drop our temp table  
  DROP TABLE #JobHits
    END 
  
  
  -- jump over error handler and exit 
  GOTO EXIT_PROC
  

--handle error
HANDLE_ERROR:

  SELECT @Success = 0

  IF @FromProc = 0
    SELECT @Success as success, COALESCE(@RowsCopied,0) as rowscopied,COALESCE(@RowsDeleted,0) as rowsDeleted

  -- exit
  RETURN 0



--handle a successful exit
EXIT_PROC:
  SELECT @Success = 1, @RowsCopied = @RowsInserted

  -- exit
  RETURN 1

Sunday 13 November 2011

Performance Tuning Tools for MySQL on LINUX

How to Performance Tune MySQL for Wordpress

Since I have been working a lot with Wordpress, PHP and Apache I have had to get used to the limitations of MySQL and the applications that are available to connect to MySQL databases such as Navicat or PHPMyAdmin.

As well as all the missing DML such as CTE's I really miss the very useful Data Management Views (DMV's) that has as they make optimising a database very easy and I have built up a large collection of SQL Server Performance Reports and tools to help me debug database performance issues.

Since working with Wordpress I have seen a lot of Plugins that use very poor SQL techniques and it seems most plugin authors that create their own tables in the Wordpress database don't even think about indexes that could increase performance of some very badly written SQL.

Having to sift through the slow query log and then run an EXPLAIN on each query is a time consuming job whereas setting up a scheduled job to monitor missing indexes and then list all suggestions is very easy to do in MSSQL with their Data Management Views (DMV's) that hold information about missing indexes, high cost queries, cached query plan re-use and much more.


My SQL Optimisation for Wordpress and LINUX

There are a number of tools available for performance tuning MySQL and I have listed a few below.


MySQLTuner.pl

MySQLTuner is a Perl script that analyzes your MySQL performance and, based on the statistics it gathers, returns recommendations based on the ShowVariables SQL that you can adjust to increase performance.

One of the good things about having root access to your own LINUX server (VPS or dedicated) is the ability to SSH in and then load and install programs remotely with a few lines of code from the command prompt.

To load and run MySQLTuner.pl on your server do the following:

Change the directory to your program folder e.g:

cd /usr/bin

Load the tool in remotely with a WGET command e.g:

wget http://mysqltuner.pl/mysqltuner.pl

Change the permissions to make it executable e.g:


chmod +x mysqltuner.pl

Run the tool e.g

myhost:/usr/bin# /usr/bin/mysqltuner.pl

>>  MySQLTuner 1.2.0 - Major Hayden 
>>  Bug reports, feature requests, and downloads at http://mysqltuner.com/
>>  Run with '--help' for additional options and output filtering
Please enter your MySQL administrative login: root
Please enter your MySQL administrative password:

-------- General Statistics --------------------------------------------------
[--] Skipped version check for MySQLTuner script
[OK] Currently running supported MySQL version 5.0.51a-24+lenny5-log
[OK] Operating on 64-bit architecture

-------- Storage Engine Statistics -------------------------------------------
[--] Status: +Archive -BDB +Federated -InnoDB -ISAM -NDBCluster
[--] Data in MyISAM tables: 419M (Tables: 98)
[!!] Total fragmented tables: 9

-------- Security Recommendations  -------------------------------------------
[OK] All database users have passwords assigned

-------- Performance Metrics -------------------------------------------------
[--] Up for: 3d 3h 0m 33s (5M q [22.010 qps], 58K conn, TX: 102B, RX: 1B)
[--] Reads / Writes: 87% / 13%
[--] Total buffers: 314.0M global + 2.6M per thread (100 max threads)
[OK] Maximum possible memory usage: 576.5M (55% of installed RAM)
[OK] Slow queries: 1% (70K/5M)
[OK] Highest usage of available connections: 16% (16/100)
[OK] Key buffer size / total MyISAM indexes: 64.0M/196.8M
[OK] Key buffer hit rate: 100.0% (18B cached / 5M reads)
[OK] Query cache efficiency: 83.1% (4M cached / 5M selects)
[!!] Query cache prunes per day: 61605
[OK] Sorts requiring temporary tables: 0% (207 temp sorts / 356K sorts)
[!!] Joins performed without indexes: 35147
[!!] Temporary tables created on disk: 44% (309K on disk / 690K total)
[OK] Thread cache hit rate: 99% (329 created / 58K connections)
[!!] Table cache hit rate: 13% (191 open / 1K opened)
[OK] Open file limit used: 22% (231/1K)
[OK] Table locks acquired immediately: 99% (1M immediate / 1M locks)

-------- Recommendations -----------------------------------------------------
General recommendations:
    Run OPTIMIZE TABLE to defragment tables for better performance
    Adjust your join queries to always utilize indexes
    When making adjustments, make tmp_table_size/max_heap_table_size equal
    Reduce your SELECT DISTINCT queries without LIMIT clauses
    Increase table_cache gradually to avoid file descriptor limits
Variables to adjust:
    query_cache_size (> 40M)
    join_buffer_size (> 128.0K, or always use indexes with joins)
    tmp_table_size (> 200M)
    max_heap_table_size (> 200M)
    table_cache (> 200)



You should carefully read the output, especially the recommendations at the end.

It shows exactly which variables you could adjust in the [mysqld] section of your my.cnf (on Debian and Ubuntu the full path is /etc/mysql/my.cnf). but be careful as this is only advice and you should find out as much as you can before changing core configuration settings.

Whenever you change your my.cnf file, make sure that you restart MySQL with this command (or use your GUI)

/etc/init.d/mysql restart


You can then run MySQLTuner again to see if it has further recommendations to improve the MySQL performance.

Another tool of a similar nature is the MySQLReport tool which can be found at http://hackmysql.com.

Information can be found here about how to read and analyse the report that is produces from this link http://hackmysql.com/mysqlreportguide.

You can load it up remotely and build it on your server in a similar way making use of an HTTP tool like CURL or WGET etc:


wget hackmysql.com/scripts/mysqlreport

--2011-11-13 02:58:47--  http://hackmysql.com/scripts/mysqlreport
Resolving hackmysql.com... 64.13.232.157
Connecting to hackmysql.com|64.13.232.157|:80... connected.
HTTP request sent, awaiting response... 200 OK
Length: 38873 (38K) [application/x-perl]
Saving to: `mysqlreport'

100%[======================================>] 38,873      --.-K/s   in 0.1s

2011-11-13 02:58:47 (254 KB/s) - `mysqlreport' saved [38873/38873] 
 


Once loaded give the newly installed file execute permission with the following command

chmod +x mysqlreport1.pl

You then call it by passing through the details of the system you want to analyse e.g:

mysqlreport --user root --host localhost --password mypsw100


MySQL 5.0.51a-24+lenny5  uptime 3 3:21:16       Sun Nov 13 03:04:11 2011

__ Key _________________________________________________________________
Buffer used    52.33M of  64.00M  %Used:  81.76
  Current      61.15M            %Usage:  95.55
Write hit      99.96%
Read hit       99.97%

__ Questions ___________________________________________________________
Total           5.96M    22.0/s
  QC Hits       4.60M    17.0/s  %Total:  77.25
  DMS           1.07M     3.9/s           17.97
  Com_        226.52k     0.8/s            3.80
  COM_QUIT     58.22k     0.2/s            0.98
  +Unknown        408     0.0/s            0.01
Slow (2)       70.43k     0.3/s            1.18  %DMS:   6.58  Log:  ON
DMS             1.07M     3.9/s           17.97
  SELECT      935.60k     3.4/s           15.70         87.35
  UPDATE      127.41k     0.5/s            2.14         11.90
  INSERT        7.63k     0.0/s            0.13          0.71
  DELETE          450     0.0/s            0.01          0.04
  REPLACE           0       0/s            0.00          0.00
Com_          226.52k     0.8/s            3.80
  set_option  169.55k     0.6/s            2.84
  change_db    56.71k     0.2/s            0.95
  optimize         91     0.0/s            0.00

__ SELECT and Sort _____________________________________________________
Scan           68.94k     0.3/s %SELECT:   7.37
Range          42.09k     0.2/s            4.50
Full join      35.21k     0.1/s            3.76
Range check         0       0/s            0.00
Full rng join       0       0/s            0.00
Sort scan     300.33k     1.1/s
Sort range     56.97k     0.2/s
Sort mrg pass     207     0.0/s

__ Query Cache _________________________________________________________
Memory usage   36.03M of  40.00M  %Used:  90.07
Block Fragmnt  10.04%
Hits            4.60M    17.0/s
Inserts       842.73k     3.1/s
Insrt:Prune    4.33:1     2.4/s
Hit:Insert     5.46:1

__ Table Locks _________________________________________________________
Waited          1.44k     0.0/s  %Total:   0.08
Immediate       1.77M     6.5/s

__ Tables ______________________________________________________________
Open              200 of  200    %Cache: 100.00
Opened          1.54k     0.0/s

__ Connections _________________________________________________________
Max used           16 of  100      %Max:  16.00
Total          58.52k     0.2/s

__ Created Temp ________________________________________________________
Disk table    310.39k     1.1/s
Table         381.99k     1.4/s    Size: 200.0M
File              431     0.0/s

__ Threads _____________________________________________________________
Running             1 of    1
Cached              7 of    8      %Hit:  99.44
Created           329     0.0/s
Slow                3     0.0/s

__ Aborted _____________________________________________________________
Clients           588     0.0/s
Connects           17     0.0/s

__ Bytes _______________________________________________________________
Sent          102.63G  378.3k/s
Received        1.95G    7.2k/s

__ InnoDB Buffer Pool __________________________________________________
Usage               0 of       0  %Used:   0.00
Read hit        0.00%
Pages
  Free              0            %Total:   0.00
  Data              0                      0.00 %Drty:   0.00
  Misc              0                      0.00
  Latched           0                      0.00
Reads               0       0/s
  From file         0       0/s            0.00
  Ahead Rnd         0       0/s
  Ahead Sql         0       0/s
Writes              0       0/s
Flushes             0       0/s
Wait Free           0       0/s

__ InnoDB Lock _________________________________________________________
Waits               0       0/s
Current             0
Time acquiring
  Total             0 ms
  Average           0 ms
  Max               0 ms

__ InnoDB Data, Pages, Rows ____________________________________________
Data
  Reads             0       0/s
  Writes            0       0/s
  fsync             0       0/s
  Pending
    Reads           0
    Writes          0
    fsync           0

Pages
  Created           0       0/s
  Read              0       0/s
  Written           0       0/s

Rows
  Deleted           0       0/s
  Inserted          0       0/s
  Read              0       0/s
  Updated           0       0/s


In a similar way you will need to know how to analyse each section to make the neccessary changes and you can find out what each calculation result means here: http://hackmysql.com/mysqlreportdoc

Another good tool to use to find out what is going on in your database is MyTop. This is an application like Top that shows the current processes running on a server but for a MySQL database rather than the whole server. It does this by analysing the same data that SHOW PROCESSLIST would output.

You can obtain the code from http://jeremy.zawodny.com/mysql/mytop/

Once loaded you call it from your command prompt like the other commands passing in the host, database and password as well as a refresh rate if you require it. I think it defaults to 5 seconds but I like to use 1 second which you can change by supplying a parameter for --s parameter e.g:

mytop --user root --password mypsw --db myDB --s 1

The results look like this with a header that shows how many queries have run per second as well as slow queries from the slow quert log.

MySQL on localhost (5.0.51a-24+lenny5-log)                                 up 3+03:48:10 [03:31:05]
 Queries: 5.7M   qps:   22 Slow:     0.0         Se/In/Up/De(%):    77/00/00/00
             qps now:    2 Slow qps: 0.0  Threads:    3 (   3/   5) 00/00/00/00
 Key Efficiency: 100.0%  Bps in/out:   0.0/  0.6   Now in/out:  41.2/10.0k

      Id      User         Host/IP         DB      Time    Cmd Query or State
      --      ----         -------         --      ----    --- ----------
   58766      root       localhost strictly         0  Query show full processlist
   58772 darkpolit       localhost strictly         1  Query select CONCAT('http://www.strictly         
   58771 strictly        localhost strictly         4  Query select CONCAT('http://www.strictly         

Another method for those who don't have direct access into their LINUX server but only a control panel like CPANEL and use Wordpress for their website is my own Wordpress Plugin - the Strictly System Checker Plugin.

The Strictly System Checker is a Wordpress plugin that is designed to allow webmasters to monitor their site at regular intervals throughout the day and to be notified if the site goes down or experiences database problems or high server loads.

This plugin was not designed to be a replacement for professional server monitoring tools however it is a nice easy to use system that can aid webmasters in monitoring their Wordpress site as well as notifying the right person whenever the site is down or running into performance problems.

How it works

  • A CRON / WebCron job initiates an HTTP request to check whether the site can be accessed.
  • The system will check for the Error establishing a database connection error message as well as searching for an optional piece of text which can help indicate whether the page has loaded correctly.
  • If the error is found or the text cannot be found then a connection to the database is attempted.
  • If successful a CHECK and REPAIR is carried out on any tables that maybe corrupted.
  • An option exists to also check for fragmented tables and an OPTIMIZE command is carried out to fix any found.
  • An SQL report is carried out to report on some key performance indicators such as the number of connections, queries, reads, writes and more.
  • A report is carried out on the webserver to look at the current server load average and if it's above a specified threshold a report to the site administrator can be triggered.
  • A similar check is carried out on the database to ensure that there are no slow running queries or the connection limit hasn't been reached.
  • If problems are found an email is then sent to the site administrator with details of the report.
An example of a report that can be emailed to system administrators or viewed from the admin part of the website is below.

System Report: 2011-11-13 04:07:07

Initiating System Report...
Initiating an HTTP request to http://www.strictly-software.com
The HTTP request to http://www.strictly-software.com took 0 second(s) to respond and returned a status code of 200
The specified search text [read more] was found within the HTTP response
The server load is currently 0.48
The server load is okay
MySQL has been running for: 26 days 4 hours 18 mins 48 secs
Total Connections: 467253 - Aborted: 0 - Connections Per Hour 743
Total Queries: 21592142 - Queries / Per Hour 34365
Joins without indexes: 0 - Joins without indexes Per Hour 0
Total Reads: 4819507 (88%) - Total Writes 630029 (12%)
The system is currently configured to accept a maximum of 100 database connections
At the time of reporting the database was running 3 query
The current database load is 3%
The database load is okay
Initiating a check for fragmented tables and indexes
Optimized table: wp_options
Optimized table: wp_postmeta
Completed check for fragmented tables and indexes
The system report has completed all its tests successfully.


Report Completed At 2011-11-13 04:07:07

Strictly Software Plugins for Wordpress


So whilst I feel the tools available for MySQL performance monitoring are lacking when compared with those available with SQL 2008 there are some that will enable you to get the job done. Hopefully this article has been helpful and if you have any tools of your own please add them to t the comment section.

Thursday 10 November 2011

Error copying tables importing or exporting the Geography or Geometry data type in SQL 2008

Error importing and exporting the Geometry and Geography data types in SQL 2008

Today I had to make some work live that involved copying a database containing UK Postcodes and their related geo-location data to another SQL 2008 server.

The table contained a list of all UK Postcodes as well as their longitude and latitude and a GeoLocation column that was based on the new SQL 2008 DataType Geography.

However when I tried to use the Import/Export wizard to copy the data I got to the Review Data Type Mapping page and was met with the following error message.


Found 1 unknown column type conversion(s) You are only allowed to save the package.


Import Geography Error Screenshot

Viewing the full error details revealed that the problem was down to SQL not understanding the Geography data type in the table I wanted to export.


The full error message details are below:



[Source Information]
Source Location : (local)
Source Provider : SQLNCLI10
Table: [dbo].[Country]
Column: CountryGeography
Column Type: geography
SSIS Type: (Type unknown ...)
Mapping file (to SSIS type): C:\Program Files\Microsoft SQL Server\100\DTS\MappingFiles\MSSQLToSSIS10.XML

[Destination Information]
Destination Location : (local)
Destination Provider : SQLNCLI10
Table: [dbo].[Country]
Column: CountryGeography
Column Type: geography
SSIS Type: (Type unknown ...)
Mapping file (to SSIS type): C:\Program Files\Microsoft SQL Server\100\DTS\MappingFiles\MSSQLToSSIS10.XML

[Conversion Steps]
Conversion unknown ...
SSIS conversion file: C:\Program Files\Microsoft SQL Server\100\DTS\binn\DtwTypeConversion.xml


First of all I checked that the database compatibility mode was set to 100 (SQL 2008) and not 90 (SQL 2005) and once I had confirmed both databases were the correct format I checked the mapping conversion XML file on the server I was doing the import from.

This XML mapping file is located at the following path C:\Program Files\Microsoft SQL Server\100\DTS\MappingFiles\MSSQLToSSIS10.XML.

On opening the file I could see that there was no mention of the geography OR geometry data types which explained why the DTS package wizard could not carry out the operation.

To fix this I copied one of the other similar data types (varbinary) and re-inserted it into the file twice before changing the names to Geography and Geometry.

You can just copy and paste the following XML into the file.


  <!-- geography -->
  <dtm:DataTypeMapping >
    <dtm:SourceDataType>
      <dtm:DataTypeName>geography</dtm:DataTypeName>
    </dtm:SourceDataType>
    <dtm:DestinationDataType>
      <dtm:SimpleType>
        <dtm:DataTypeName>DT_IMAGE</dtm:DataTypeName>
      </dtm:SimpleType>
    </dtm:DestinationDataType>
  </dtm:DataTypeMapping>




  <!-- geometry -->
  <dtm:DataTypeMapping >
    <dtm:SourceDataType>
      <dtm:DataTypeName>geometry</dtm:DataTypeName>
    </dtm:SourceDataType>
    <dtm:DestinationDataType>
      <dtm:SimpleType>
        <dtm:DataTypeName>DT_IMAGE</dtm:DataTypeName>
      </dtm:SimpleType>
    </dtm:DestinationDataType>
  </dtm:DataTypeMapping>


Save the MSSQLToSSIS10.XML file and you should now be able to import or export a table that contains the datatypes geography and geometry.

If for some reason after editing the file it still doesn't work, try the following:

  • restarting the SQL Server service on the server you are running the export/import wizard from.
  • restarting your own SQL Server Client Tools.
  • If you are running a 64bit version of Windows 7 like I am then you might need to also edit the same file in the 32 bit Program folder e.g C:\Program Files (x86)\Microsoft SQL Server\100\DTS\MappingFiles\MSSQLToSSIS10.XML.