Finding my XFS Bug
Thursday, October 6th, 2011
Recently one of our servers had some filesystem corruption, and it was not the first time this has happened. We rely heavily on hardlinks through rsync's --link-dest option, so I'm reasonably sure the issue is triggered by the massive number of hardlinks that are created and deleted on that system.
I’ve written a small script that repeats that create-and-delete cycle in a loop and started it running a few minutes ago. My guess is that the problem should show up in a few days.
#!/bin/bash
RSYNC=/usr/bin/rsync
REVISIONS=10
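function rsync_kernel () {
    # Timestamp used as the name of the new snapshot directory
    DATE=$(date +%Y%m%d%H%M%S)
    # Collect existing snapshots; the glob expands in sorted order, oldest first
    BDATES=()
    for f in /tmp/2011*
    do
        [ -d "$f" ] && BDATES+=("$f")
    done
    CT=${#BDATES[@]}
    if (( CT > 0 ))
    then
        RECENT=${BDATES[$((CT-1))]}
    else
        # First run: seed from the unpacked kernel tree
        RECENT="/tmp/linux-3.0.3"
    fi
    LINKDEST="--link-dest=$RECENT"
    # Absolute destination so the new snapshot is found by the glob above
    $RSYNC -aplxo "$LINKDEST" "$RECENT/" "/tmp/$DATE/"
    if (( CT >= REVISIONS ))
    then
        # BDATES was built before the new snapshot existed, so removing
        # indexes 0..DELFIRST leaves exactly $REVISIONS snapshots
        DELFIRST=$(( CT - REVISIONS ))
        loop=0
        for d in "${BDATES[@]}"
        do
            if (( loop <= DELFIRST ))
            then
                rm -rf "$d"
            fi
            loop=$((loop+1))
        done
    fi
}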
while true
do
    rsync_kernel
    echo .
    sleep 1
done
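
The script assumes that unchanged files in each new snapshot end up as hardlinks to the previous one. A minimal sketch of how that could be checked follows. It is not part of the test above: the helper name same_inode and the scratch paths are hypothetical, and plain ln stands in for what --link-dest does for an unchanged file.

```shell
#!/bin/bash
# Hypothetical helper: succeeds when both paths refer to the same inode
# (same device and inode number), i.e. they are hardlinks of each other.
same_inode () {
    [ "$1" -ef "$2" ]
}

# Scratch directory with two stand-in "snapshots" (illustrative paths only)
WORK=$(mktemp -d)
mkdir "$WORK/snap1" "$WORK/snap2"
echo "data" > "$WORK/snap1/file"

# A hardlink, as --link-dest would create for an unchanged file
ln "$WORK/snap1/file" "$WORK/snap2/file"
# An independent copy with identical content, for contrast
echo "data" > "$WORK/snap2/copy"

if same_inode "$WORK/snap1/file" "$WORK/snap2/file"; then LINKED=yes; else LINKED=no; fi
if same_inode "$WORK/snap1/file" "$WORK/snap2/copy"; then COPIED=yes; else COPIED=no; fi
echo "linked=$LINKED copy=$COPIED"

rm -rf "$WORK"
```

Pointing same_inode at the same file in two of the real /tmp snapshots should report a match for as long as the filesystem is healthy, which makes it a cheap sanity check to run alongside the loop.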
