rebelpeon.com

Thursday, November 30, 2006

The Site and Other Updates

OK, so the site is back up.  That was incredibly easy.  As far as I can tell, there was zero downtime.  The only thing that may have looked strange is that I didn’t have the most current database on the dreamhost server, but a simple backup and restore fixed that.  aaron spruit (.com) is up and running on the new host as well.  I realized that having the two databases split across hosts could really cause headaches if I didn’t move both sites now, especially if I added an entry on both sites.  I’ve updated my registrar with the new name servers, but since that takes a while to propagate, I also updated my current name servers to point at the new host instead of the old one.  That way everyone hits the new site right away, instead of some visitors going to one place and others going to another.

In other news, now that I have this running under Apache, I’m going to start playing with the .htaccess file.  What I’m hoping to accomplish is to get rid of all the extraneous crap in the URLs on this site.  Therefore, if you are reading this through RSS, the feed may go down.  If it does, just visit the site and grab the new feed URL.  It will probably end up being http://www.rebelpeon.com/rss_2.0/ instead of http://www.rebelpeon.com/index.php?/rebelpeon/rss_2.0/.  This should also help with search engines, since they tend to have a hard time with URL query strings.
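To make the planned URL cleanup concrete, here is a minimal sketch in Python of the mapping a rewrite rule would have to perform: a clean URL comes in and gets translated to the query-string URL the CMS serves today.  The function name and the `LEGACY_PREFIX` constant are made up for illustration; the real work would be done by Apache rewrite rules in .htaccess, which are not shown here.

```python
from urllib.parse import urlsplit

# Assumption: the part of the current URLs that the cleanup is meant to hide.
LEGACY_PREFIX = "/index.php?/rebelpeon"


def to_legacy_url(clean_url: str) -> str:
    """Map a clean URL to the query-string URL the CMS currently serves."""
    parts = urlsplit(clean_url)
    # Make sure the path starts with a slash before appending it to the prefix.
    path = parts.path if parts.path.startswith("/") else "/" + parts.path
    return f"{parts.scheme}://{parts.netloc}{LEGACY_PREFIX}{path}"


print(to_legacy_url("http://www.rebelpeon.com/rss_2.0/"))
```

With the feed URL from above, this prints the long form that is in use right now, so visitors and feed readers would only ever see the short one.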

posted by aaron at 10:38 PM
posted in rebelpeon.com • (0) comments

Site Downtime

I’m hoping it won’t be for too long, or at all, but I’m going to attempt to finish the migration of my site to dreamhost.  It should progress without any downtime, but knowing my luck, there will be some.  However, aaron spruit (.com) will remain at the same place until everything is working smoothly with this site.  Moving aaron spruit (.com) won’t take much at all, as both of these sites run off the same CMS and MySQL database.  It’s just moving the first one that’s a pain.

You’ll know when everything has propagated because there should be a dreamhost banner at the bottom of the new site.

posted by aaron at 10:33 AM
posted in rebelpeon.com • (0) comments

Monday, July 24, 2006

CSS Problems

Well, I’ve started a new redesign of the site, but I’m having a few problems that maybe someone could help me with.  Basically, I can’t get the title to sit 10px off the bottom.  As you can see, the title sits 10px from the left like I want, but it won’t sit 10px above the bottom.  If you go to my test page and increase and decrease the size of the page font, you’ll see what I’m talking about.

The full CSS for this is available, but the section in question is given below.

#heading .title
{
    color: #FFF;
    font-size: 2em;
    position: absolute;
    bottom: 10px;
    left: 10px;
}

Is this a normal feature of CSS?  Basically, I want the text to sit 10px above the bottom of the dark background regardless of how much the font size is increased or decreased.  If you look at an entry at the bottom, that text is always 10px away from the border, so why can’t the title do the same?  Yes, I know that it’s handled differently, but that’s the look I desire.

posted by aaron at 10:39 AM
posted in web, rebelpeon.com • (1) comments

Friday, July 14, 2006

New Domain

Well, I’ve got it in my mind that I want to redo my website (again).  It’s been a while, and I just feel like it needs a bit of a refresh.  So, I’ve been brushing up on some CSS so that I can do it right.  I’m probably going to keep the graphics to a minimum, like they currently are, just to keep load times down, but at the same time, I really like some of the designs at CSS Zen Garden.

Anyways, I also want to break the photos off this site and create a new one.  Something similar to Chromasia, or G8, or Staring at the Sun.  Basically, since I’ve been getting better, I want someplace to showcase the best of the best. 

Really, what it comes down to is that I need even more to do, right…

So, I’ve registered aaronspruit.com in addition to rebelpeon.com.  However, now I have the dilemma of which site should go where.  I can come up with pros and cons for both.  What are all your thoughts?  Currently they both point here.

posted by aaron at 02:43 PM
posted in photography, web, rebelpeon.com • (5) comments

Monday, May 15, 2006

What I Learned

This is essentially part two of the post mortem on the server failure.  The first post was basically just me outlining exactly what happened, while this post will be about what I’ve learned.

1.  System State backups are not the greatest thing in the world.  In fact, they are pretty much useless except for a few key situations.  Basically, in all of the Microsoft Press books for the MCSE tests (and well, just about any other study material), system state backups are treated as God’s gift to backups.  In reality this is hardly the case.  In fact, after doing system state backups on all of my servers, the only one that actually worked after a restore was the domain controller.  Granted, this was because there was nothing else on the machine.

All the other machines had software installed when the backups were taken, and now, after restoring the system state, the machines are in a weird state where they have all the registry entries for software that isn’t physically on the machine (the registry gets restored, the files don’t).  Now, this would be great if I had backed up the whole machine, but I didn’t.  Oh, and don’t even get me started on a system state restore and IIS.  Put simply, the metabase that is restored from the system state won’t function on your new server, because your machine crypto key is different.

2.  The physical network at the apartment is a mess, and it definitely limits our ability to do a lot of things.  It seems to be further limiting my ability to create a perimeter and internal network with ISA.  For an as-yet-unknown reason, anything not connected to the bridge/switch that my ESX box is connected to cannot ping the 192.168.2.0/24 network, which resides as a virtual switch on the ESX box, even with the static routes set.  What’s really aggravating is that if I initiate a ping from the 192.168.2.0/24 network to a specific machine in the 192.168.1.0/24 network, then everything works fine until that tunnel through ISA is closed.  However, once that tunnel is closed, nothing even hits ISA’s external interface, so it’s not really a tunnel through ISA, but a mapped route that’s appearing and disappearing.  Annoying to say the least.  If you feel like you want to help, or want to see a better explanation, feel free to check out my thread over at isaserver.org.

3.  WinSCP.  I can’t believe I haven’t been using this app with ESX before.  Setting up FTP can be a pain, and is a security hole, so being able to easily upload ISOs or whatever to the ESX box has been unbelievably helpful.

4.  Linux.  It’s amazing how much easier it is to learn things when you actually have a reason to, like when it’s broken.  Unfortunately, I wasn’t able to find a lot of the original problems I had on Google.  However, after thinking about it for a bit, and using basic troubleshooting skills, I’ve been able to solve all the Linux problems.  Thankfully.

5.  The new Perc controller rocks.  The site is noticeably more performant, and it doesn’t take forever to initialize an array.  It’s amazing what a generation later and an extra 112 MB of cache can do for you.
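The routing trouble in item 2 boils down to two subnets that can only talk through a router (ISA, in this case).  As a rough sketch, Python’s standard ipaddress module can show when a pair of hosts needs a route at all.  The two network ranges come from item 2; labeling 192.168.1.0/24 as the perimeter side is an assumption, and the host addresses are invented for the example.

```python
import ipaddress

# Ranges from item 2.  Calling 192.168.1.0/24 the perimeter side is an assumption.
PERIMETER = ipaddress.ip_network("192.168.1.0/24")
INTERNAL = ipaddress.ip_network("192.168.2.0/24")


def needs_route(src: str, dst: str) -> bool:
    """Return True when src and dst are not on the same known subnet,
    meaning the traffic has to cross a router to get there."""
    src_ip = ipaddress.ip_address(src)
    dst_ip = ipaddress.ip_address(dst)
    for net in (PERIMETER, INTERNAL):
        if src_ip in net and dst_ip in net:
            return False
    return True


# Hypothetical hosts: a workstation on the perimeter and a VM on the internal side.
print(needs_route("192.168.1.50", "192.168.2.10"))
print(needs_route("192.168.2.10", "192.168.2.11"))
```

Any pair for which this returns True depends on the static routes and on ISA forwarding the traffic in both directions, which is exactly the part that is misbehaving.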

posted by aaron at 11:14 AM
posted in web, rebelpeon.com • (0) comments

Friday, May 12, 2006

Post Mortem

Anyways, it’s alive.  It may have taken a little longer than expected, but it’s back.  Hopefully. 

I’ve rebuilt all the virtual machines, mostly from backups, so I didn’t actually lose anything, but I’ve also changed a lot of the layout behind the scenes.  This, along with ordering new parts, and the rest of life, has kept the site off longer than I would’ve liked, but such is life without enterprise level machines and support.

So now it’s time for a post mortem on all this fun stuff.

The week of April 17th is when this story starts.  Basically, the website kept going down and the server hosting it became unresponsive to everything but ping.  I couldn’t SSH into the box or even log in at the console.  So, I’d simply restart it.  After this happened a few times, I started scouring the logs to see what exactly was going on.  Basically, I couldn’t find anything.  As you may remember from a previous post, I thought that I had the problem licked.  However, I had never actually seen an error message or anything telling me exactly what was going on.  I was just going on gut instinct.

So after figuring I had fixed the problem, I went on with life, and it did work for quite a few days.  And then it started happening again.  So I decided to reinstall ESX, thinking it may be a problem with that.  It still hung a few times, and since I couldn’t log in at the box once it had hung, I decided to log in as soon as I rebooted the server and just leave it logged in.  Maybe something was being written to the display before it hung.  Well, the server worked for a while, and then sometime on Sunday the 28th it went down again.

I wasn’t home at the time and had to wait until I got back, which was around 10 PM.  I went to the machine, and sure enough, there was the first actual error I’d seen.

SCSI Host 0 Reset (PID 0)
Time Out Again ---

So, looking at the error, I thought that it may be the hard drive on SCSI ID 0.  Looking back, this was the first sign as to what was actually wrong.  I then replaced the hard drive and booted it back up.  The machine didn’t go anywhere.  No ESX, no nothing other than trying to boot from the NIC.  Definitely not a good sign.  This was a RAID 5 setup, so it should’ve rebuilt the array and everything should’ve been fine after I replaced the hard drive.  Well, apparently it didn’t want to do that, but it was too late now.  This was sign number two as to what the true problem was.  It was now 2 AM on Sunday, with work the following day, so I turned everything off and gave up for the night.
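For anyone wondering why a RAID 5 array should survive a single dead drive: each stripe stores a parity block that is the XOR of its data blocks, so any one missing block can be recomputed from the rest.  Here is a minimal sketch of that idea in Python.  The three-disk layout and the block contents are invented for illustration, and a real controller also rotates parity across the disks, which this skips.

```python
from functools import reduce


def xor_blocks(blocks):
    """XOR equal-length byte blocks together, one byte position at a time."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))


# One stripe on a hypothetical three-disk array: two data blocks plus parity.
data_a = b"rebel"
data_b = b"peon!"
parity = xor_blocks([data_a, data_b])

# Pretend the disk holding data_b died, then rebuild it from the survivors.
rebuilt_b = xor_blocks([data_a, parity])
print(rebuilt_b == data_b)
```

The rebuild only works if the controller can still read the surviving disks and the array metadata correctly, which is presumably why a misbehaving controller breaks it.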

The following day I attempted to fix it again.  Since I still wasn’t sure what was going on, and I wanted to rule out the RAM, I ran Memtest86+ on the machine for a few hours.  No problems found.  I tried to do an upgrade with ESX, but ESX told me it couldn’t find any of the old partitions or installs.  Great.  Well, maybe it was just the partition table that was gone, and not all the data.  I found this great utility CD called the Ultimate Boot CD.  On it there’s a program called TestDisk, which can salvage Linux partition tables.  After having to mess with the boot CD for a while to get the MegaRaid SCSI drivers installed on it, I was off and running.  Needless to say, that didn’t work: no partitions found.

Well, that means all the data’s essentially gone, since I was definitely not going to pay someone to get it back.  Thankfully I had started doing backups not more than 2 weeks prior to all this happening!

The rebuild of all the virtual machines then commenced.  However, with the server hanging, it took quite a while to get everything back up and running.  What made it even more interesting was the myriad of errors that each hang would create.  Honestly, I don’t think I saw the same error more than twice the whole time it was down.

During this time I also redid the setup to put all my machines in an internal network behind an ISA server.  Right now there’s the external network (the internet), a perimeter network (my workstation, Binford’s workstation, and some misc machines that don’t need security), and then the internal network (all my enterprise level machines).  There is still one huge problem with this, but I’m still working on it, and it’s not a big deal.  Basically, from my workstation and Binford’s workstation you can’t ping the internal network unless the machine you’re trying to ping pings out first.  It’s something to do with our messed up physical infrastructure, but hopefully I can fix it.

Basically, this whole time was spent trying to get the site and back-end up to where they were prior to the problems, and also to fix what was wrong.  The more it happened, the more I thought it was the SCSI card.  So I changed the channel that all the drives were on, and rebuilt the array.  Needless to say that didn’t help much, and so this past Sunday I bought a new Dell Perc 3/DC card on eBay for $61 shipped.  Yesterday it came in, and last night I migrated all the virtual machines off, installed the new card, rebuilt the array, reinstalled ESX, migrated the virtual machines back on, and then brought the machines back online.

Right now we’re flying on the new Perc Card that has 112 MB more cache, and the ability to initialize an array in under 5 seconds as opposed to 100 minutes.  Hopefully we don’t see a hang.  Let’s all hold our breath, mkay?

posted by aaron at 05:18 AM
posted in web, rebelpeon.com • (0) comments

Tuesday, April 25, 2006

Under Construction Page

Well, if you attempted to visit my page yesterday (and probably for some of today), you were hit with an Under Construction page.  Basically, I did this so that I could put something up that explains what’s going on, which is that the server died, again.

At least I know it’s definitely not the logs filling it up.  However, I still have no clue what’s going on.  In trying to troubleshoot, I found that the CDROM drive on the server was bad though.  Thankfully I have another, which was pretty much identical.

I’ve also got a functioning monitor and keyboard hooked up to the machine right now, so hopefully I’ll be able to log onto it and see what’s going on.

Here’s hoping.

Oh yeah, and for those with @rebelpeon.com email addresses, I had set up forwarding to another of your accounts, so you should’ve all gotten email during the downtime.  I’ve since switched it back to be delivered to rebelpeon.com correctly.  If it happens again, I’ll just keep switching them around so that there is no email downtime (or at least very little).

Update 4/25 10:33 PM
More downtime.  I’ve finally reinstalled ESX to see what happens, but I still have no idea what’s going on.  I ran some memory tests, and memory doesn’t seem to be the culprit either.  If the reinstall doesn’t fix it, the next thing I’m thinking is that it could be power related, so a new surge protector may be in order.

For those using email, it’s still forwarding to another address.  I’ve just set up the website so I can easily tell if it goes down again or not.

posted by aaron at 09:58 PM
posted in web, rebelpeon.com • (0) comments