This is an outdated version of the HTCondor Manual. You can find current documentation at http://htcondor.org/manual.
HTCondor™ Version 7.9.6 Manual
8. Frequently Asked Questions (FAQ)
This is where you can find quick answers to some commonly asked questions about HTCondor.
Subsections
8.1 Obtaining & Installing HTCondor
Where can I download HTCondor?
When I click to download HTCondor, it sends me back to the downloads page!
What platforms are supported?
Can I get the source code?
What is Personal HTCondor?
What do I do now? My installation of HTCondor does not work.
After an installation of HTCondor, why do the daemons refuse to start?
Why do standard universe jobs never run after an upgrade?
8.2 Setting up HTCondor
How do I set up a central manager on a machine with multiple network interfaces?
How do I get more than one job to run on my SMP machine?
How do I configure a separate policy for the CPUs of an SMP machine?
How do I set up my machines so that only specific users' jobs will run on them?
How do I configure HTCondor to run my jobs only on machines that have the right packages installed?
How do I configure HTCondor to only run jobs at night?
How do I configure HTCondor such that all machines do not produce checkpoints at the same time?
Why will the condor_master not run when a local configuration file is missing?
8.3 Running HTCondor Jobs
Why aren't any or all of my jobs running?
I'm at the University of Wisconsin-Madison Computer Science Dept., and I am having problems!
I'm getting a lot of e-mail from HTCondor. Can I just delete it all?
Why will my vanilla jobs only run on the machine where I submitted them from?
Why does the requirements expression for the job I submitted have extra things that I did not put in my submit description file?
When I use condor_compile to produce a job, I get an error that says, "Internal ld was not invoked!". What does this mean?
Why might my job be preempted (evicted)?
What signals get sent to my jobs when HTCondor needs to preempt or kill them, or when I remove them from the queue? Can I tell HTCondor which signals to send?
Why does my Linux job have an enormous ImageSize and refuse to run anymore?
Why does the time output from condor_status appear as [?????] ?
The user condor's home directory cannot be found. Why?
HTCondor commands (including condor_q) are really slow. What is going on?
Where are my missing files? The command when_to_transfer_output = ON_EXIT_OR_EVICT is in the submit description file.
8.4 HTCondor on Windows
Will HTCondor work on a network of mixed Unix and Windows machines?
What versions of Windows will HTCondor run on?
My Windows program works fine when executed on its own, but it does not work when submitted to HTCondor.
Why is the condor_master daemon failing to start, giving an error about "In StartServiceCtrlDispatcher, Error number: 1063"?
Jobs submitted from Windows give an error referring to a credential.
Jobs submitted from Unix to execute on Windows do not work properly.
When I run condor_status I get a communication error, or the HTCondor daemon log files report a failure to bind.
My job starts but exits right away with status 128.
How can I access network files with HTCondor on Windows?
What is wrong when condor_off cannot find my host, and condor_status does not give me a complete host name?
Does USER_JOB_WRAPPER work on Windows machines?
condor_store_cred is failing, and I'm sure I'm typing my password correctly.
My submit machine cannot have more than 120 jobs running concurrently. Why?
Why do HTCondor daemons exit after logging a 10038 (WSAENOTSOCK) error on some machines?
Why do HTCondor daemons exit with "Unexpected performance counter size", "unable to spawn the ProcD" or "loadavg thread died, restarting. (exit code=2)" errors?
Why does the Windows Installer fail with "Error 2738. Could not access VBScript run time for custom action"?
Why does HTCondor sometimes fail to parse floating point numbers?
8.5 Grid Computing
What must be installed to access grid resources?
I am the administrator at Physics, and I have a 64-node cluster running HTCondor. The administrator at Chemistry is also running HTCondor on her 64-node cluster. We would like to be able to share resources. How do we do this?
Using my Globus gatekeeper to submit jobs to the HTCondor pool does not work. What is wrong?
8.6 Managing Large Workflows
How do I get meaningful output from condor_q with so many jobs in the queue?
What does HTCondor offer that can help with running a large number of jobs?
8.7 Troubleshooting
If I see PERMISSION DENIED in my log files, what does that mean?
What happens if the central manager crashes?
Why did the condor_schedd daemon die and restart?
When I ssh/telnet to a machine to check particulars of how HTCondor is doing something, it is always vacating or unclaimed when I know a job had been running there!
What is wrong? I get no output from condor_status, but the HTCondor daemons are running.
Why does HTCondor leave mail processes around?
8.8 Other questions
Is there an HTCondor mailing list?
My question isn't in the FAQ!
htcondor-admin@cs.wisc.edu