Sunday, July 16, 2006

Copying MySQL Tables

I find this one-liner very handy when copying MySQL tables between different hosts.
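The one-liner itself is missing from this post; here is a sketch of the kind of command I mean, assuming credentials live in ~/.my.cnf on both ends and the target database already exists (host, database, and table names are hypothetical):

```
mysqldump --opt source_db table_name | ssh user@remote_host mysql target_db
```

This streams the dump straight into the remote server, so no intermediate file is needed.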

MySQL Failed to open log file (InnoDB)

One needs to be very careful before deleting files, as once deleted they are gone for good. A friend of mine called today because MySQL was failing to start on his server. Earlier, while "cleaning" his server, he had deleted binary log files without fully understanding their role. When I logged in to his server, I noticed the following errors:

060716 17:21:53  mysqld started
060716 17:21:53 InnoDB: Started; log sequence number 0 1736000119
/usr/sbin/mysqld: File './mysql-bin.000002' not found (Errcode: 2)
060716 17:21:53 [ERROR] Failed to open log (file './mysql-bin.000002', errno 2)
060716 17:21:53 [ERROR] Could not open log file
060716 17:21:53 [ERROR] Can't init tc log
060716 17:21:53 [ERROR] Aborting

060716 17:21:53 InnoDB: Starting shutdown...
060716 17:21:55 InnoDB: Shutdown completed; log sequence number 0 1736000119
060716 17:21:55 [Note] /usr/sbin/mysqld: Shutdown complete

060716 17:21:55 mysqld ended


I deleted the entries from mysql-bin.index and MySQL started without a problem.
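A safer alternative to hand-editing is to prune the index file mechanically, keeping only the entries whose binlog files still exist on disk. A small sketch, simulated in a scratch directory (all file names are hypothetical stand-ins for a real data directory):

```shell
dir=$(mktemp -d)
cd "$dir"
touch mysql-bin.000001                                   # binlog file that still exists on disk
printf './mysql-bin.000001\n./mysql-bin.000002\n' > mysql-bin.index
# Keep only the index entries whose binlog files are still present:
while read -r f; do [ -e "$f" ] && echo "$f"; done < mysql-bin.index > mysql-bin.index.new
mv mysql-bin.index.new mysql-bin.index
cat mysql-bin.index                                      # only the surviving binlog remains
```

Run this against the real mysql-bin.index only while mysqld is stopped, and keep a backup copy of the index first.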

Friday, July 14, 2006

MyISAM to InnoDB

One thing that sucks really bad with MyISAM is table level locking which can cause some serious issues at times when INSERT/UPDATE/DELETE load is very high. If you are experiencing problems in your application associated with locking, try converting your tables to InnoDB.

I was asked today a question as to how one can convert a MyISAM table to InnoDB. I will blog about it in detail later but for right now here's one way:

ALTER TABLE table_name ENGINE = InnoDB;

and here's another way:

- dump the table to a SQL file
- change table type from MyISAM to InnoDB in the dumped file
- disable keys (to speed up the process)
- SET AUTOCOMMIT to 0 (for speeding up the process)
- load the table
- SET AUTOCOMMIT to 1
- enable keys (the index is rebuilt at this point)
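The engine swap in the second step is a one-line sed. A sketch, simulating the dump with a stand-in file (database, table, and file names are hypothetical):

```shell
dir=$(mktemp -d)
dump="$dir/mytable.sql"
# Stand-in for the output of: mysqldump mydb mytable > mytable.sql
printf 'CREATE TABLE `mytable` (\n  `id` int(11) NOT NULL\n) ENGINE=MyISAM DEFAULT CHARSET=latin1;\n' > "$dump"
# Change the table type from MyISAM to InnoDB in the dumped file:
sed -i 's/ENGINE=MyISAM/ENGINE=InnoDB/' "$dump"
grep 'ENGINE=' "$dump"
# The load itself would then run against the server, e.g.:
#   mysql mydb < mytable.sql
```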

Which one is preferable depends on various factors, such as the time window in which the conversion must be done.



Thursday, July 13, 2006

Indexes, Low Index Selectivity and Index Performance

The other day I was talking to Jay about low index selectivity. One of the columns in my users table had extremely low index selectivity (less than 0.0000005). However, when running certain queries involving COUNT on that column, the index appeared to be used: when I ran the same queries with IGNORE INDEX, the query execution time was significantly higher.

There have been earlier posts on PlanetMySQL noting that in cases of extremely low index selectivity, MySQL's cost-based optimizer will never use the index.

Jay pointed out that using COUNT(column_name) in a query is a special case where, even with extremely low selectivity, having an index speeds up the query. For all other queries not involving COUNT, the index will not be used if the selectivity is very low.
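A sketch of the special case (table, column, and index names here are hypothetical): COUNT(column_name) counts non-NULL values, and the server can read those straight out of the index on that column, so selectivity doesn't matter.

```sql
SELECT COUNT(gender) FROM users;                            -- can be resolved from the index alone
SELECT COUNT(gender) FROM users IGNORE INDEX (idx_gender);  -- forced to scan the table instead
```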

I will be posting the queries and further analysis in a later post.


Monday, July 10, 2006

err-log or log-error

Today I was investigating why one of the MySQL servers wasn't logging anything to the error log. By default, the error log is written to the data directory with a .err extension.

The "err-log" configuration option only works with safe_mysqld. If you try to specify this option in the configuration file under [mysqld], the following error is generated:

[ERROR] /path/to/mysqld: unknown variable 'err-log=/path/to/error_log'


When this option is used, the MySQL server doesn't report the error log file path in the "log_error" variable in the output of "SHOW VARIABLES".

To specify the error log file so that it will show up in SHOW VARIABLES, we need to use the "log-error" option.
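For example, in my.cnf (the path here is only an illustration):

```
[mysqld]
log-error = /var/log/mysql/error.log
```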

Wednesday, July 05, 2006

Compiling MySQL Administrator on Solaris 10

I have been trying to compile MySQL Administrator on a Solaris 10 box with no success. I followed Ronald's notes (link later) on the compile issues related to mysql-gui-common but continued having problems.

So eventually I filed a bug report and yesterday it was verified. So hopefully it can be resolved soon.

Frank

Friday, June 23, 2006

MySQL Slaves Behind Master 2

In my earlier post, I touched on the subject of investigating reasons why MySQL slaves get behind master. In this post I will continue that discussion.

  1. Different specs on the master and slave servers: This is especially important if we have a write-intensive replication environment. If our master server is more powerful than the slave servers, there is a good chance that during peak times our slaves will start lagging behind. I have seen environments where the master server was twice as powerful as the slave server and the slaves would fall considerably behind. If that is the case, the best solution is to upgrade the slave server so it is as powerful as the master server.
  2. Unoptimized Tables: Recently I was working in an environment where the tables had never been optimized since they were created. Optimized tables give faster access to data than unoptimized tables, especially if you have variable-length records and perform a lot of INSERT/UPDATE/DELETE operations. When we optimize tables using the "OPTIMIZE TABLE" command, MySQL defragments the table's data file, sorts the index file and rebuilds the index statistics. The table data is then stored contiguously.

    From the MySQL Manual:

    OPTIMIZE TABLE should be used if you have deleted a large part of a table or if you have made many changes to a table with variable-length rows (tables that have VARCHAR, VARBINARY, BLOB, or TEXT columns). Deleted rows are maintained in a linked list and subsequent INSERT operations reuse old row positions. You can use OPTIMIZE TABLE to reclaim the unused space and to defragment the data file.
    Although running the "OPTIMIZE TABLE" statement regularly on the master and slave servers will improve performance, it will not ensure that the slaves are always up to date, especially if huge differences in specs exist between the master and the slave.

  3. Unoptimized RAID level on the slaves: If you're using RAID 5, you may want to seriously consider RAID 10, as RAID 5 is expensive when it comes to disk operations and traffic between the disk and the controller.

    Many DBAs choose RAID 10 as their primary choice for databases, especially when the load on the databases gets high. One of the main reasons for this is that with RAID 10 there is no parity to calculate, as compared to RAID 5, allowing for faster writes. For more information see the RAID post on my other blog or read the Wikipedia article on RAID.

    Using RAID 10 or RAID 01 will allow for faster writes (more throughput) which may help slaves stay relatively more up to date.
  4. Unoptimized Queries: This one is a no-brainer: if you have unoptimized queries reading data, they are going to take longer and will use server resources that would otherwise go into staying up to date with the master.
  5. Disable Key writes: If you do not perform reads on the slaves, you may want to disable the key writes on slaves as suggested by Boyd Hemphill in this discussion.
  6. InnoDB vs MyISAM: If our application is read intensive, using InnoDB can help in terms of table locking (Thanks Rick James). I recommend reading Peter's recent article about using MyISAM tables in production, which was in response to Arjen's excellent article.
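A minimal sketch of the OPTIMIZE TABLE step from point 2 (the table name is hypothetical; note the table is locked while the statement runs, so pick a quiet window):

```sql
OPTIMIZE TABLE users;
-- In a replicated setup, OPTIMIZE TABLE is written to the binary log by
-- default, so it will run on the slaves as well.
```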
To be continued ...

Wednesday, June 21, 2006

Investigating reasons why slaves get behind master

Since MySQL uses asynchronous replication, slaves can get behind the master from time to time. Among the reasons a slave can get behind is heavy query load (SELECTs) on the slave servers. From time to time I come across a slave server that starts lagging behind during the peak hours, causing the Seconds_Behind_Master value to go higher and higher. As the load settles, the slave eventually catches up. I have seen Seconds_Behind_Master values in excess of 19000. With a value so high, it is clear that the slave is falling considerably behind in executing the relay log.

As Mike pointed out in a comment to my earlier post "seconds behind master only tells you the difference between when a statement was put into the relay log on the slave and when it was executed, it doesn't account for the time it takes for the statement to get from the master to the slave."

On a slave server, MySQL compares the TIMESTAMP value taken from the slave and subtracts the TIMESTAMP value present in the binary log and then adds the difference between the UNIX_TIMESTAMP value taken from the master and the slave. (See Seconds Behind Master weirdness if you are getting a Seconds_Behind_Master value of 18446744073709551615).

If we want to decrease the value of the Seconds_Behind_Master, we need to find the queries that take extraordinarily long to execute on both the master and the slave. To do this we can use the slow query log (also see slow query log manual page for the latest version).

We can enable the logging of slow queries by starting mysqld with the

--log-slow-queries

option. If we want, we can also specify the file where the slow query log should be kept:

--log-slow-queries=/path/to/slow_queries_log

By default, MySQL will create the slow query log as hostname-slow.log in the data directory.

Using another option long_query_time, we can specify the amount of time that is considered acceptable. For instance, if we specify

long_query_time = 5

in the my.cnf file then all queries that take longer than 5 seconds will be considered slow queries and therefore logged.

I believe the default value for long_query_time is 10 seconds.

Once you have the slow queries log, you can read it with mysqldumpslow.
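For example, to see the ten slowest query patterns (using the hypothetical log path from above):

```
mysqldumpslow -s t -t 10 /path/to/slow_queries_log
```

Here -s t sorts by query time and -t 10 limits the output to the top ten entries; similar queries are grouped together with their literal values abstracted.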

By investigating and optimizing the "slow queries" we can help our slaves not fall too far behind the master.

More to come. Comments, questions and suggestions are welcome.

Finding out how far behind are the slaves using Seconds_Behind_Master

When managing a large number of MySQL slaves, often one needs to view a summarized list of the seconds by which the slaves are falling behind. Here is a small script that will allow you to do just that.

#!/usr/bin/bash
# Print each slave host followed by its replication lag.
while read -r host ; do
  echo "$host"
  mysql -h "$host" -e "SHOW SLAVE STATUS \G" | grep "Seconds_Behind_Master"
done < /list/slaves


To use it create a list of slave hosts and place it in /list/slaves or wherever you like. Then create a script /scripts/slave_behind_all and paste the above code in it.

Tuesday, June 20, 2006

Proper way to shutdown a slave

There is a right way to stop a MySQL slave server and a wrong way. The wrong way is to simply shut down the MySQL server, which can cause problems when starting the server and slave back up.

The proper way to stop a slave server is to first issue "STOP SLAVE" and then shutdown the server. Here's how you can shutdown slave on multiple MySQL hosts.

cat /lists/dbs | while read i ; do echo $i; /path/to/mysql -h $i -e "STOP SLAVE"; done


Here's another way (thanks Toby for the alternative one liner to run a command against multiple hosts):

xargs -n 1 mysql -e "STOP SLAVE" -h < /lists/dbs



Stopping the slave first will ensure that the replication thread is stopped before the server goes down.

Monday, June 19, 2006

Running a command on multiple MySQL hosts

I have posted a one liner that can allow you to run a command on multiple MySQL hosts at my programming oneliners blog.

New Job

I apologize for the lack of posting on my blog. Life has been pretty busy, as last week I started a new job as a MySQL DBA in NY at a leading photo blogging community.

I am very excited about this job and look forward to working with a great team that is really passionate about what they are doing.

These days I am trying to get settled in NY/NJ area.

As time goes by I will be posting more. At the moment, thanks for staying tuned.

Monday, June 05, 2006

MySQL Performance Tuning Presentations

Peter Zaitsev maintains a very nice MySQL performance tuning blog at MySQL Performance Blog, where he also shares his performance tuning presentations, which I highly recommend to any MySQL DBA. At the MySQL Users Conference 2006, I attended (and really enjoyed) Peter's session. You can get the slides of that presentation from MySQLUC.com.

Thursday, May 25, 2006

Everything social

I originally wrote this post on May 24 but it has been sitting as a draft ever since.

Ever since I returned from the MySQL users conference, things have taken a turn for me. I find myself so incredibly busy with projects that often it gets past 3:00 AM before I get to hit the bed. Of course that means I haven't been able to keep up to date with my emails.

I have a couple of projects going on under the hood. For one, I am busy launching the next version of Adoppt, which will make things much easier to understand ;). I am also busy simplifying the navigation and working on the email and invite system.

When I built Adoppt, I had just one thing in mind, to make it as user focused and friendly as possible (to keep the spirit of Web 2.0) and to this day that remains my goal. Many thanks to all those who provided me with valuable feedback on the project. I agree it is one of the most ambitious projects that I have undertaken to date.

In addition to Adoppt, I have three other projects in the pipeline. All use MySQL of course.

First, someone recommended sometime ago that I add PlanetMySQL to my RSS reader. The problem is that while I tested out many RSS readers, I don't have one that I particularly like. The reason being that I always wanted to code a RSS aggregator with my own hands.

Call me crazy but that's what I love to do. Code.

So about 10:30 last night I sat down with my wife Michelle to think of a domain name for my RSS reader and aggregator. After about thirty minutes of brainstorming, we finally settled on feedles.com.

After securing the domain, I modified my hosts file (so I won't have to wait for the DNS propagation) and started making my RSS aggregator.

Around 3:00 AM I felt the aggregator was in a decent enough state for me to start adding RSS and Atom feeds.

Today, I ran into an issue sorting feed entries when retrieving them from MySQL. Since the latest items in a feed are shown first (and entered first in the db), I couldn't immediately come up with a way of retrieving them in the exact order in which they were received from the feeds. At that point I had two options:

1. come up with a way of taking the publication date in various feeds (in a huge variety of formats) and try to make sense out of that, or
2. create a batch system where every time I start parsing feeds for new content, I assign it a batch number which is then stored with each entry retrieved. That way I can query by specifying "ORDER BY batch_id DESC, id"
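A sketch of the batch approach (table and column names here are hypothetical):

```sql
CREATE TABLE entries (
  id       INT NOT NULL AUTO_INCREMENT PRIMARY KEY,
  batch_id INT NOT NULL,          -- incremented once per parsing run
  title    VARCHAR(255)
);

-- Newest parsing run first; within a run, rows come back in insertion
-- order, which matches the order the feed listed them:
SELECT * FROM entries ORDER BY batch_id DESC, id;
```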

Now I know there must be a better way of doing this (and I would love to hear about it). But the good thing is that for now it works great.

The second project I am working on is a simplified version of Adoppt that allows for social bookmarking. If you guessed that the domain for my project is socialbookmarking.org, then you are quite right. I have had the domain name for quite some time and so this weekend I finally decided to sit down and get it closer to completion as well. Thanks to Ruby on Rails, my wife and my work on Adoppt, I was able to extract and simplify the interface for SocialBookmarking.org in about two days of work.


In a nutshell, socialbookmarking.org allows a user to

1. make friends
2. see what their friends are bookmarking
3. subscribe to any member's bookmarks (without becoming a friend)
4. make blog posts
5. edit their bookmarks easily using "in place editors" and AJAX.

Now, I need to work on finalizing the preliminary chapters of my Pro Rails book and send it to the editor. Hopefully he will like what he sees (and reads) and I will move one step closer to getting the book published. And oh yes, all the people who were kind enough to be my beta readers, I have not forgotten you and will be fulfilling my end of the promise soon.

And the third project? Oh, well let's just wait for now and see. Hint: I am working on it with another Planet'eer.

For now, you can check out my social bookmarking home page.

Frank

Wednesday, May 17, 2006

Planet marks

There have been some really cool postings at PlanetMySQL that I have been wanting to read and blog about. However, since I have been really busy (temporarily), I haven't had the chance to do so. Following is the list of posts I really liked, or would like to read at some later point.

Phew! That's a long list. There are many more cool posts that I haven't had the chance to read closely. I will blog about them later. Hopefully they will remain on PlanetMySQL until I have read them ;)


Thanks
Frank

Overview of MySQL Engine, Cluster, Performance Tuning

At the MySQL conference, Timour Katchaounov presented one of my favorite sessions. I have my notes from that session on speeding up queries posted on my blog.

Recently, Timour was kind enough to email me the links to his and some other talks at Google.

Overview of MySQL Engine and New Features

Cluster Talk by Stewart:

Performance tuning (Jay):

Thanks Timour!

Frank

How I work series.

I have been really enjoying the "How I work" series by Dave Rosenberg as it provides very good insight into how our peers work.

Here are some of my favorites (no particular order):
Brian Aker
Sheeri Kritzer
Jason Gilmore
Mike Olson

I have been really busy working on a few projects for clients and haven't been getting any free time. Hopefully this weekend I can reply to everyone who has been kind enough to email me.

Congratulations Markus!

Just wanted to take a moment to congratulate Markus on joining the MySQL web development team. As he says, it must feel great to work for MySQL. Man, I should have hung out with the "right guys" at the conference too ;)

Congratulations Markus, I am sure you will help MySQL conquer more heights.

- Frank

Saturday, May 13, 2006

MySQL DBA Job Leads

I have been receiving a lot of requests from all over the US regarding DBA jobs. If you are currently looking for some hot jobs, drop me an email by writing to softwareengineer99 at yahoo dot com. Some of the jobs are offering as much as 150K.

I would also be interested in hearing from you if you have a position open in your company.

Is there any site that helps connect MySQL DBAs with the open jobs worldwide?

Farhan / Frank

Open Source Software

Ronald recently asked me to provide a list of Open Source software that I have used or recommend. Here's a list that I compiled in one sitting. Since I have been working with OSS for quite some time now, this list is just a sample. As time permits, I will add more to it.

1. Typo (RoR) [some issues with RoR 1.1]
2. phpMyAdmin
3. OSCommerce [Recommended if you're ready to make a lot of customizations]
4. Interchange (Perl) [Overkill for small projects but ideal for large scale ones. At one point, Google used to use Interchange to power their online store. Now they use OSCommerce.]
5. OWL [A fairly decent DMS (Document Management System)]
6. WordPress
7. BugZilla
8. phpBugTracker: Ideal when a quick setup is needed.
9. phpBB: [Needs a lot of customizations]
10. Drupal [Ideal if you need a CMS (Content Management System) but have a small number of categories. Because Drupal loads the list of categories in memory, it's not recommended for sites with a large number of categories]
11. Nucleus: A decent CMS that can be extended easily.
12. PHP Gallery
13. PHPDig: [A nice search engine to index and search small web sites. Indexing process is really slow]
14. Lucene (Java): [The king of search engines, IMO]
15. FCKeditor
16. XOOPS
17. Zope
18. Apache and MySQL (obviously)
19. mytop
20. AWStats
21. APF (Advanced Policy Firewall)
22. BFD (Brute Force Detection)
23. PHPNuke
24. phpAdsNew
25. PHPOpenChat



I am currently looking to evaluate/use the following very soon:

1. Lighty (Lighttpd)

I would also be interested in knowing what OSS other PlanetMySQL readers use.

Congratulations to Mike and Stewart for being the top on PlanetMySQL.

I still need to post photos from the conference and other notes from the sessions.

Also check out Sheeri's post on "neat tricks". I agree with her that having additional details on PlanetMySQL, such as "1 post a day", would be very nice.

Oh, and this post marks my 100th post on this blog. Yay!

- Frank/Farhan